Stabilizing the lasso against cross-validation variability
S. Roberts and G. Nowak
Computational Statistics & Data Analysis, 2014, vol. 70, issue C, 198-211
Abstract:
An abundance of high-dimensional data has meant that L1 penalized regression, known as the lasso, has become an indispensable tool of the practitioner. A feature of the lasso is a “tuning” parameter that controls the amount of shrinkage applied to the coefficients. In practice, a value for the tuning parameter is chosen using the method of cross-validation. It is shown that the model that is selected by the lasso can be extremely sensitive to the fold assignment used for cross-validation. A consequence of this sensitivity is that the results from a lasso analysis can lack interpretability. To overcome this model-selection instability of the lasso, a method called the percentile-lasso is introduced. The model selected by the percentile-lasso corresponds to the model selected by the lasso, when the lasso is fitted using an appropriate percentile of the possible “optimal” tuning parameter values. It is demonstrated that the percentile-lasso can achieve substantial improvements in both model-selection stability and model-selection error compared to the lasso. Importantly, when applied to real data the percentile-lasso, unlike the lasso, produces interpretable results, that is, results that are robust to the assignment of observations to folds for cross-validation. The percentile-lasso is easily applied to extensions of the lasso and in the context of penalized generalized linear models.
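The procedure described in the abstract can be sketched roughly as follows: repeat cross-validation over several random fold assignments, record the "optimal" tuning parameter from each run, and refit the lasso at a chosen percentile of those values. This is a minimal illustration and not the authors' implementation. The function names, the coordinate-descent solver, the nearest-rank percentile rule, the tuning-parameter grid and the default percentile of 75 are all assumptions made for the example; the abstract only says that "an appropriate percentile" is used.

```python
import math
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used in each coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=100, tol=1e-8):
    """Minimise (1/2n)||y - Xb||^2 + lam*||b||_1 by coordinate descent (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, kept up to date after every update
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        max_delta = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # (1/n) * x_j' (partial residual that excludes feature j)
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
                max_delta = max(max_delta, abs(delta))
        if max_delta < tol:
            break
    return beta

def cv_best_lambda(X, y, lambdas, k, rng):
    """Tuning parameter with the lowest k-fold CV error for ONE random fold assignment."""
    n = len(X)
    idx = list(range(n))
    rng.shuffle(idx)  # the fold assignment is the source of variability
    folds = [idx[f::k] for f in range(k)]
    errors = []
    for lam in lambdas:
        sse = 0.0
        for fold in folds:
            held = set(fold)
            train = [i for i in idx if i not in held]
            beta = lasso_cd([X[i] for i in train], [y[i] for i in train], lam)
            for i in fold:
                pred = sum(b * x for b, x in zip(beta, X[i]))
                sse += (y[i] - pred) ** 2
        errors.append(sse / n)
    return lambdas[errors.index(min(errors))]

def repeated_cv_lambdas(X, y, lambdas, k=5, repeats=10, seed=0):
    """Collect the CV-optimal tuning parameter over several fold assignments."""
    rng = random.Random(seed)
    return [cv_best_lambda(X, y, lambdas, k, rng) for _ in range(repeats)]

def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile (an assumed convention, chosen for simplicity)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def percentile_lasso(X, y, lambdas, pct=75, **cv_kwargs):
    """Refit the lasso at a percentile of the repeated-CV optimal values (pct is illustrative)."""
    lam = nearest_rank_percentile(repeated_cv_lambdas(X, y, lambdas, **cv_kwargs), pct)
    return lam, lasso_cd(X, y, lam)

def make_data(n=30, p=5, seed=1):
    """Synthetic data for the demo: only the first two features carry signal (needs p >= 2)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0, 0.5) for row in X]
    return X, y
```

Calling `repeated_cv_lambdas` with different seeds shows how far the single-run "optimal" value can move with the fold assignment. In this sketch a higher `pct` selects a larger tuning parameter and therefore a sparser model.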
Keywords: Model-selection; p≫n; Penalized regression; Regularization; Shrinkage
Date: 2014
Citations: 5 (in EconPapers)
Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S016794731300323X
Full text for ScienceDirect subscribers only.
Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:70:y:2014:i:c:p:198-211
DOI: 10.1016/j.csda.2013.09.008
Computational Statistics & Data Analysis is currently edited by S.P. Azen
Bibliographic data for series maintained by Catherine Liu.