Super Learning: An Application to the Prediction of HIV-1 Drug Resistance
Sandra E. Sinisi,
Eric C. Polley,
Maya L. Petersen,
Soo-Yon Rhee and
Mark J. van der Laan
Additional contact information
Sandra E. Sinisi: University of California, Berkeley
Eric C. Polley: University of California, Berkeley
Maya L. Petersen: University of California, Berkeley
Soo-Yon Rhee: Stanford University
Mark J. van der Laan: Division of Biostatistics, School of Public Health, University of California, Berkeley
Statistical Applications in Genetics and Molecular Biology, 2007, vol. 6, issue 1, 1-26
Many alternative data-adaptive algorithms can be used to learn a predictor from observed data. Examples of such learners include decision trees, neural networks, support vector regression, least angle regression, logic regression, and the Deletion/Substitution/Addition algorithm. The optimal learner for prediction varies with the underlying data-generating distribution. In this article we introduce the "super learner", a prediction algorithm that applies any set of candidate learners and uses cross-validation to select among them. Theory shows that, asymptotically, the super learner performs essentially as well as or better than any of the candidate learners. We present the theory behind the super learner and illustrate its performance using simulations. We further apply the super learner to a data example in which we predict the phenotypic antiretroviral susceptibility of HIV based on viral genotype. Specifically, we apply the super learner to predict susceptibility to a specific protease inhibitor, nelfinavir, using a set of database-derived non-polymorphic treatment-selected mutations.
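The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three toy candidate learners, the squared-error loss, the interleaved fold assignment, the toy data, and all function names below are assumptions made for illustration. The sketch fits each candidate on the training folds, scores it on the held-out fold, picks the candidate with the lowest cross-validated risk, and refits that candidate on all of the data.

```python
from statistics import mean

# Hypothetical candidate learners: each takes training data and returns a predictor.
def fit_mean(xs, ys):
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    # Ordinary least squares for a single covariate.
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def fit_nearest(xs, ys):
    # One-nearest-neighbour prediction.
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def cv_risk(fit, xs, ys, k=5):
    """Cross-validated mean squared error of one candidate learner."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))  # interleaved folds (an assumption)
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        predictor = fit(train_x, train_y)
        total += sum((ys[i] - predictor(xs[i])) ** 2 for i in held_out)
    return total / n

def select_by_cross_validation(candidates, xs, ys, k=5):
    """Pick the candidate with the lowest CV risk and refit it on all data."""
    risks = {name: cv_risk(fit, xs, ys, k) for name, fit in candidates.items()}
    best = min(risks, key=risks.get)
    return best, risks, candidates[best](xs, ys)

# Toy data (hypothetical): an exactly linear relationship.
xs = [i / 2 for i in range(20)]
ys = [2 * x + 1 for x in xs]

candidates = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}
best, risks, predictor = select_by_cross_validation(candidates, xs, ys)
print(best, risks)
```

On data generated by a different mechanism, a different candidate may win. That is the point of letting cross-validation choose rather than fixing a single learner in advance.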
Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:6:y:2007:i:1:n:7
Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf