Model Selection Based on FDR-Thresholding Optimizing the Area under the ROC-Curve
Alexandra C. Graf and Peter Bauer
Medical University of Vienna
Statistical Applications in Genetics and Molecular Biology, 2009, vol. 8, issue 1, 22
Abstract:
We evaluate variable selection by multiple tests controlling the false discovery rate (FDR) to build a linear score for predicting clinical outcome in high-dimensional data. Quality of prediction is assessed by the receiver operating characteristic (ROC) curve for prediction in independent patients. We thus try to combine two goals: prediction and controlled structure estimation. We show that the FDR threshold yielding the ROC curve with the largest area under the curve (AUC) varies widely across parameter constellations that are not known in advance. Hence, we investigated a new cross-validation procedure based on the maximum rank correlation estimator to determine the optimal selection threshold. This procedure (i) allows choosing an appropriate selection criterion, (ii) provides an estimate of the FDR close to the true FDR, and (iii) is simple and computationally feasible for moderate to small sample sizes. Low estimates of the cross-validated AUC (the estimates generally being positively biased) and large estimates of the cross-validated FDR may indicate a lack of sufficiently prognostic variables and/or too small sample sizes. The method is applied to an oncology dataset.
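As an illustration of the two building blocks named in the abstract, the following is a minimal sketch, not the authors' procedure. It assumes the Benjamini-Hochberg step-up rule as the FDR-controlling test (the abstract does not name a specific procedure) and uses the empirical Mann-Whitney estimate of the AUC. The function names `bh_select` and `empirical_auc` and the example p-values are hypothetical, and the maximum rank correlation estimator and the cross-validation loop are not implemented here.

```python
from typing import List, Sequence


def bh_select(p_values: Sequence[float], q: float) -> List[int]:
    """Return indices of variables selected at FDR level q.

    Assumption: Benjamini-Hochberg step-up rule; the abstract only says
    "multiple tests controlling the FDR".
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose p-value lies at or below its critical value
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k = rank
    # Step-up: every variable ranked at or below k is selected.
    return sorted(order[:k])


def empirical_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Empirical AUC: share of (positive, negative) pairs ranked correctly,
    with ties counted as one half."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical p-values for ten candidate variables.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.065, 0.074, 0.205, 0.212, 0.216]
strict = bh_select(p, q=0.05)   # fewer variables enter the score
lenient = bh_select(p, q=0.10)  # a looser threshold admits more variables
```

In a threshold search of the kind the abstract describes, one would evaluate a grid of values for `q`, build a score from the variables each value selects, and compare the resulting AUC on held-out patients. How the score weights are estimated is left open in this sketch.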
Keywords: variable selection; FDR; ROC-curve; cross validation
Date: 2009
Full text: https://doi.org/10.2202/1544-6115.1462 (subscription to the journal or payment for the individual article is required)
Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:8:y:2009:i:1:n:31
Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html
DOI: 10.2202/1544-6115.1462
Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf
Publisher: De Gruyter