Gene Expression Data Analysis Using Machine Learning And Deep Learning Techniques For Cancer Microarray Data
Abstract
Microarray technology has emerged as a pivotal tool in cancer research, providing a
high-throughput platform for simultaneously analyzing expression profiles across thousands
of genes. This technology has revolutionized cancer detection and classification by enabling
researchers to uncover critical biomarkers, identify molecular subtypes of cancer, and predict
clinical outcomes with remarkable precision. Despite its immense potential, the effective
utilization of microarray data in cancer classification poses significant challenges due to its
unique characteristics and inherent complexities. Microarray datasets are typically
characterized by high dimensionality: thousands of gene expression features
combined with a limited number of samples. This phenomenon, often called the "large p,
small n" problem, severely impacts the performance of machine learning models, leading to
overfitting and poor generalization to unseen data. Additionally, noise and variability arising
from technical inconsistencies in sample preparation and processing further complicate the
development of reliable and robust cancer detection models.
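The "large p, small n" effect can be illustrated with a toy sketch. It is not the thesis method and uses no microarray data: the features and labels are purely random bits, and the names `make_data`, `best_single_gene`, and `accuracy` are hypothetical helpers introduced only for this illustration. With far more features than samples, some feature matches the training labels well by chance alone, yet carries no signal on new samples.

```python
import random

def make_data(n_samples, n_features, rng):
    """Synthetic stand-in for expression data: random binary features and labels."""
    X = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    y = [rng.randint(0, 1) for _ in range(n_samples)]
    return X, y

def best_single_gene(X, y):
    """Return (index, polarity, training accuracy) of the best-matching single feature."""
    n = len(y)
    best = (0, 1, 0.0)
    for j in range(len(X[0])):
        matches = sum(1 for i in range(n) if X[i][j] == y[i])
        # polarity 1 predicts the feature value itself, polarity 0 predicts its inverse
        for polarity, hits in ((1, matches), (0, n - matches)):
            if hits / n > best[2]:
                best = (j, polarity, hits / n)
    return best

def accuracy(X, y, j, polarity):
    """Accuracy of the one-feature rule (feature j, given polarity) on (X, y)."""
    hits = sum(1 for row, label in zip(X, y) if (row[j] == label) == bool(polarity))
    return hits / len(y)

rng = random.Random(0)
X_train, y_train = make_data(20, 2000, rng)   # small n, large p
X_test, y_test = make_data(500, 2000, rng)    # fresh samples, same (absent) signal

j, polarity, train_acc = best_single_gene(X_train, y_train)
test_acc = accuracy(X_test, y_test, j, polarity)
print(f"selected feature {j}: train accuracy {train_acc:.2f}, test accuracy {test_acc:.2f}")
```

Because the labels are independent of every feature, the gap between training and test accuracy here is entirely an artifact of searching many features with few samples, which is the overfitting risk the abstract describes.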
This thesis addresses these challenges through innovative methodologies to enhance the
accuracy, reliability, and interpretability of cancer detection using microarray data. We
propose an advanced feature selection framework incorporating the Improved Binary Grey
Wolf Optimizer (IBGWO) to tackle the high dimensionality issue. This optimization
algorithm effectively selects an optimal subset of features, significantly reducing the
dimensionality while retaining the most relevant and informative gene expressions. The
reduced feature set mitigates overfitting and ensures the predictive models achieve superior
robustness and accuracy. Furthermore, we introduce hybrid feature selection methods that
synergize filter and wrapper techniques. In this two-tiered approach, filter methods serve as an
initial screening mechanism to eliminate irrelevant features, while wrapper methods, such as
Moth-flam