Performance and estimation of the true error rate of classification rules built with additional information. An application to a cancer trial

Conde David, Salvador Bonifacio, Rueda Cristina and Fernández Miguel A.
Additional contact information
Conde David: Departamento de Estadística e I.O., Universidad de Valladolid, 47011 Valladolid, Spain
Salvador Bonifacio: Departamento de Estadística e I.O., Universidad de Valladolid, 47011 Valladolid, Spain
Rueda Cristina: Departamento de Estadística e I.O., Universidad de Valladolid, 47011 Valladolid, Spain
Fernández Miguel A.: Departamento de Estadística e I.O., Universidad de Valladolid, 47011 Valladolid, Spain

Statistical Applications in Genetics and Molecular Biology, 2013, vol. 12, issue 5, 583-602

Abstract: Classification rules that incorporate the additional information usually present in discrimination problems have received some attention in recent years, as they perform better than the usual rules. Fernández, M. A., C. Rueda and B. Salvador (2006): “Incorporating additional information to normal linear discriminant rules,” J. Am. Stat. Assoc., 101, 569–577, proved that these rules have lower total misclassification probability than the usual Fisher’s rule. In this paper we consider two issues. First, we compare these rules with those based on the shrinkage estimators of the mean proposed by Tong, T., L. Chen and H. Zhao (2012): “Improved mean estimation and its application to diagonal discriminant analysis,” Bioinformatics, 28(4): 531–537, with regard to four criteria: total misclassification probability, area under the ROC curve, well-calibratedness and refinement. Second, we consider the estimation of the true error rate, a parameter of considerable interest in applications. We prove results on the apparent error rate of the rules that expose the need for new estimators of their true error rate, and we propose four such estimators. Two of them are defined by incorporating the additional information into the leave-one-out bootstrap. The other two are the corresponding cross-validation-after-bootstrap versions. We compare these estimators with the usual ones in a simulation study and in a cancer trial application, showing the good behavior of the rules that incorporate additional information and of the new leave-one-out bootstrap estimators of their true error rate.
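
As a rough illustration of two quantities named in the abstract, the sketch below computes the apparent error rate (the rule is evaluated on its own training data) and a generic leave-one-out bootstrap estimate (each observation is predicted only by rules fitted on resamples that exclude it). It is not the authors' method: the classifier is a plain two-class Fisher rule with equal priors, the additional (order-restriction) information is not used, and every function name and the simulated data are assumptions made for this example.

```python
import numpy as np


def fit_lda(X, y):
    """Fit a two-class Fisher linear rule (equal priors, pooled covariance)."""
    m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    X0c, X1c = X[y == 0] - m0, X[y == 1] - m1
    pooled = (X0c.T @ X0c + X1c.T @ X1c) / (len(X) - 2)
    w = np.linalg.solve(pooled, m1 - m0)   # discriminant direction
    c = w @ (m0 + m1) / 2.0                # midpoint threshold
    return w, c


def predict_lda(model, X):
    """Assign class 1 when the discriminant score exceeds the threshold."""
    w, c = model
    return (X @ w > c).astype(int)


def apparent_error(X, y):
    """Error rate of the rule on the same data used to fit it."""
    return float(np.mean(predict_lda(fit_lda(X, y), X) != y))


def loo_bootstrap_error(X, y, n_boot=200, seed=0):
    """Generic leave-one-out bootstrap estimate of the true error rate."""
    rng = np.random.default_rng(seed)
    n = len(y)
    errors = np.zeros(n)   # misclassification count per observation
    counts = np.zeros(n)   # number of resamples that left it out
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        # Skip resamples with fewer than two points in either class.
        if min(np.sum(y[idx] == 0), np.sum(y[idx] == 1)) < 2:
            continue
        out = np.setdiff1d(np.arange(n), idx)  # observations not resampled
        if out.size == 0:
            continue
        pred = predict_lda(fit_lda(X[idx], y[idx]), X[out])
        errors[out] += pred != y[out]
        counts[out] += 1
    used = counts > 0
    return float(np.mean(errors[used] / counts[used]))


# Simulated two-class data (an assumption, not the cancer trial data).
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(0.0, 1.0, (30, 2)),
    rng.normal(0.0, 1.0, (30, 2)) + [3.0, 0.0],
])
y = np.repeat([0, 1], 30)

app = apparent_error(X, y)
loo = loo_bootstrap_error(X, y)
print(f"apparent error: {app:.3f}, leave-one-out bootstrap error: {loo:.3f}")
```

The apparent error rate reuses the training data and therefore tends to be optimistic, which is the usual motivation for resampling estimators of this kind.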

Keywords: area under ROC curve; bootstrap; cancer diagnostic test research; discriminant analysis; order restrictions; true error rate
Date: 2013

Downloads: (external link)
https://doi.org/10.1515/sagmb-2012-0037 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.

Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:12:y:2013:i:5:p:583-602:n:3

Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html

DOI: 10.1515/sagmb-2012-0037

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf

More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter
Bibliographic data for series maintained by Peter Golla.

 
Page updated 2025-03-19
Handle: RePEc:bpj:sagmbi:v:12:y:2013:i:5:p:583-602:n:3