The One Standard Error Rule for Model Selection: Does It Work?

Yuchen Chen and Yuhong Yang
Additional contact information
Yuchen Chen: Carlson School of Management, University of Minnesota, Minneapolis, MN 55455, USA
Yuhong Yang: School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA

Stats, 2021, vol. 4, issue 4, 1-25

Abstract: Previous research has extensively discussed how to select regularization parameters when applying regularization methods to high-dimensional regression. The popular “One Standard Error Rule” (1se rule), used with cross validation (CV), selects the most parsimonious model whose prediction error is not much worse than the minimum CV error. This paper examines the validity of the 1se rule from a theoretical angle and also studies its estimation accuracy and its performance in regression estimation and variable selection, particularly for Lasso in a regression framework. Our theoretical result shows that when a regression procedure produces a regression estimator that converges relatively fast to the true regression function, the standard error estimation formula in the 1se rule is justified asymptotically. The numerical results show the following: (1) the 1se rule in general does not necessarily provide a good estimate of the intended standard deviation of the cross validation error, and the estimation bias can be 50–100% upwards or downwards in various situations; (2) the results tend to support that the 1se rule usually outperforms the regular CV in sparse variable selection and alleviates the over-selection tendency of Lasso; (3) in regression estimation or prediction, the 1se rule often performs worse. In addition, comparisons are made on two real data sets: Boston Housing Prices (large sample size n, small/moderate number of variables p) and Bardet–Biedl data (large p, small n). Data-guided simulations are conducted to provide insight into the relative performance of the 1se rule and the regular CV.
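
For readers unfamiliar with the rule described in the abstract, the following is a minimal Python sketch of how a 1se selection is commonly computed from a table of per-fold CV errors. It is an illustration under stated assumptions, not the authors' code: the function name `one_se_rule`, the fold-based standard error formula (sample standard deviation across folds divided by the square root of the number of folds), and the convention that a larger penalty gives a more parsimonious model (as is typical for Lasso) are all assumptions introduced here, and the numbers are made up for the example.

```python
import numpy as np


def one_se_rule(lambdas, fold_errors):
    """Return (lambda_min, lambda_1se) from per-fold CV errors.

    lambdas:     shape (m,), candidate penalty values (hypothetical grid).
    fold_errors: shape (K, m), prediction error of each candidate on each fold.
    Assumption: a larger lambda means a sparser, more parsimonious model.
    """
    lambdas = np.asarray(lambdas, dtype=float)
    fold_errors = np.asarray(fold_errors, dtype=float)
    n_folds = fold_errors.shape[0]

    # Mean CV error per candidate and its fold-based standard error.
    mean_err = fold_errors.mean(axis=0)
    std_err = fold_errors.std(axis=0, ddof=1) / np.sqrt(n_folds)

    # Regular CV: the candidate with the minimum mean CV error.
    i_min = int(np.argmin(mean_err))

    # 1se rule: among candidates within one standard error of the minimum,
    # take the most parsimonious one (here, the largest lambda).
    threshold = mean_err[i_min] + std_err[i_min]
    eligible = np.flatnonzero(mean_err <= threshold)
    i_1se = int(eligible[np.argmax(lambdas[eligible])])

    return lambdas[i_min], lambdas[i_1se]


# Illustrative, made-up errors for 3 folds and 4 penalty values.
example_lambdas = [0.01, 0.1, 1.0, 10.0]
example_errors = [
    [1.2, 1.0, 1.05, 2.0],
    [1.0, 0.8, 0.85, 2.2],
    [1.1, 0.9, 0.95, 1.8],
]
lam_min, lam_1se = one_se_rule(example_lambdas, example_errors)
print(lam_min, lam_1se)
```

In this toy table the regular CV picks the penalty with the smallest mean error, while the 1se rule moves to a larger penalty whose mean error is still within one standard error of that minimum. The abstract's first numerical finding concerns exactly the `std_err` quantity: the paper reports that it can be a poor estimate of the intended standard deviation of the CV error.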

Keywords: subsampling; tuning parameter selection; estimation accuracy; regression estimation; variable selection
JEL-codes: C1 C10 C11 C14 C15 C16
Date: 2021

Downloads: (external link)
https://www.mdpi.com/2571-905X/4/4/51/pdf (application/pdf)
https://www.mdpi.com/2571-905X/4/4/51/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jstats:v:4:y:2021:i:4:p:51-892:d:673038

Stats is currently edited by Mrs. Minnie Li

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jstats:v:4:y:2021:i:4:p:51-892:d:673038