A hybrid feature selection method combining Gini index and support vector machine with recursive feature elimination for gene expression classification
Talal Almutiri and Faisal Saeed
International Journal of Data Mining, Modelling and Management, 2022, vol. 14, issue 1, 41-62
Abstract:
Microarray datasets suffer from the curse of dimensionality because they contain a large number of genes but few samples, and this high dimensionality increases computational cost and complexity. Feature selection (FS) addresses the problem by choosing informative genes that can improve the effectiveness of classification. This study proposes a hybrid feature selection method, GI-SVM-RFE, which combines the Gini index with support vector machine recursive feature elimination: it calculates a weight for each gene and recursively selects only ten genes as the informative genes. To measure the impact of the proposed method, the experiments cover four scenarios: a baseline without feature selection, GI feature selection, SVM-RFE feature selection, and GI combined with SVM-RFE. Eleven microarray datasets were used. The proposed method improved classification accuracy compared with previous studies.
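The abstract does not say how the Gini index weights are computed, so the following is only a minimal sketch of a GI-style filter stage under stated assumptions: each gene is scored by the largest Gini impurity reduction obtainable from a single threshold split, and the highest-scoring genes are kept. The function names (gini_impurity, gini_gain, rank_genes_by_gini) and the toy data are hypothetical and are not taken from the paper. The SVM-RFE stage, which would then prune the retained genes recursively, is not shown.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_gain(values, labels):
    """Largest impurity reduction from one threshold split on a single gene."""
    parent = gini_impurity(labels)
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate two equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (
            len(left) * gini_impurity(left) + len(right) * gini_impurity(right)
        ) / n
        best = max(best, parent - weighted)
    return best


def rank_genes_by_gini(X, y, k):
    """Return the indices of the k highest-scoring genes and all gene scores."""
    n_genes = len(X[0])
    scores = [gini_gain([row[j] for row in X], y) for j in range(n_genes)]
    order = sorted(range(n_genes), key=lambda j: scores[j], reverse=True)
    return order[:k], scores


# Toy data, made up for illustration: 6 samples (rows), 3 genes (columns).
X = [
    [0.1, 5.0, 2.0],
    [0.2, 1.0, 2.1],
    [0.3, 4.0, 1.9],
    [0.9, 2.0, 2.0],
    [1.0, 6.0, 2.2],
    [1.1, 3.0, 1.8],
]
y = [0, 0, 0, 1, 1, 1]

top, scores = rank_genes_by_gini(X, y, k=2)
print(top, scores)
```

In this toy example, gene 0 separates the two classes perfectly and therefore receives the highest score. In a hybrid setup like the one the abstract describes, the genes retained by such a filter would be passed to SVM-RFE, which would eliminate features recursively until ten remain.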
Keywords: classification; feature selection; gene expression; Gini index; microarray; recursive feature elimination.
Date: 2022
Downloads: (external link)
http://www.inderscience.com/link.php?id=122038 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:ijdmmm:v:14:y:2022:i:1:p:41-62
More articles in International Journal of Data Mining, Modelling and Management from Inderscience Enterprises Ltd