Two-Stage Classification with SIS Using a New Filter Ranking Method in High Throughput Data
Sangjin Kim and
Jong-Min Kim
Additional contact information
Sangjin Kim: Department of Mathematical Sciences, University of Texas at El Paso, El Paso, TX 79968, USA
Jong-Min Kim: Division of Sciences and Mathematics, University of Minnesota at Morris, Morris, MN 56267, USA
Mathematics, 2019, vol. 7, issue 6, 1-16
Abstract:
Over the last decade, high dimensional data have been popularly paid attention to in bioinformatics. These data increase the likelihood of detecting the most promising novel information. However, there are limitations of high-performance computing and overfitting issues. To overcome the issues, alternative strategies need to be explored for the detection of true important features. A two-stage approach, filtering and variable selection steps, has been receiving attention. Filtering methods are divided into two categories of individual ranking and feature subset selection methods. Both have issues with the lack of consideration for joint correlation among features and computing time of an NP-hard problem. Therefore, we proposed a new filter ranking method (PF) using the elastic net penalty with sure independence screening (SIS) based on resampling technique to overcome these issues. We demonstrated that SIS-LASSO, SIS-MCP, and SIS-SCAD with the proposed filtering method achieved superior performance of not only accuracy, AUROC, and geometric mean but also true positive detection compared to those with the marginal maximum likelihood ranking method (MMLR) through extensive simulation studies. In addition, we applied it in a real application of colon and lung cancer gene expression data to investigate the classification performance and power of detecting true genes associated with colon and lung cancer.
Keywords: LASSO; SCAD; MCP; SIS; elastic net; accuracy; AUROC; geometric mean (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2019
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
https://www.mdpi.com/2227-7390/7/6/493/pdf (application/pdf)
https://www.mdpi.com/2227-7390/7/6/493/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:7:y:2019:i:6:p:493-:d:235459
Access Statistics for this article
Mathematics is currently edited by Ms. Emma He
More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().