Best subset selection via cross-validation criterion

Yuichi Takano and Ryuhei Miyashiro
Additional contact information
Yuichi Takano: University of Tsukuba
Ryuhei Miyashiro: Tokyo University of Agriculture and Technology

TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, 2020, vol. 28, issue 2, No 10, 475-488

Abstract: This paper is concerned with the cross-validation criterion for selecting the best subset of explanatory variables in a linear regression model. In contrast with the use of statistical criteria (e.g., Mallows’ $C_p$, the Akaike information criterion, and the Bayesian information criterion), cross-validation requires only mild assumptions, namely, that samples are identically distributed and that training and validation samples are independent. For this reason, the cross-validation criterion is expected to work well in most situations involving predictive methods. The purpose of this paper is to establish a mixed-integer optimization (MIO) approach to selecting the best subset of explanatory variables via the cross-validation criterion. This subset-selection problem can be formulated as a bilevel MIO problem. We then reduce it to a single-level mixed-integer quadratic optimization problem, which can be solved exactly by using optimization software. The efficacy of our method is evaluated through simulation experiments by comparison with statistical-criterion-based exhaustive search algorithms and $L_1$-regularized regression. Our simulation results demonstrate that, when the signal-to-noise ratio was low, our method delivered good accuracy for both subset selection and prediction.
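
The cross-validation criterion described in the abstract can be illustrated with a small brute-force sketch. This is not the authors' mixed-integer formulation: it simply enumerates every subset and scores each one by k-fold cross-validated squared error of an ordinary least-squares fit, which is feasible only for a handful of variables. The function names (`cv_error`, `best_subset_by_cv`), the fold count, and the synthetic data are assumptions made for illustration.

```python
import itertools
import numpy as np


def cv_error(X, y, subset, folds):
    """Sum of squared validation errors of OLS restricted to `subset` columns."""
    cols = list(subset)
    total = 0.0
    for val_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[val_idx] = False
        # Design matrices: an intercept column plus the selected variables.
        A_train = np.column_stack(
            [np.ones(train_mask.sum()), X[train_mask][:, cols]]
        )
        A_val = np.column_stack([np.ones(len(val_idx)), X[val_idx][:, cols]])
        # Fit on the training folds only, then score on the held-out fold.
        coef, *_ = np.linalg.lstsq(A_train, y[train_mask], rcond=None)
        total += float(np.sum((y[val_idx] - A_val @ coef) ** 2))
    return total


def best_subset_by_cv(X, y, n_folds=5, seed=0):
    """Exhaustively search all 2**p subsets for the lowest CV error."""
    n, p = X.shape
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    best_subset, best_err = (), np.inf
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            err = cv_error(X, y, subset, folds)
            if err < best_err:
                best_subset, best_err = subset, err
    return best_subset, best_err


# Hypothetical usage on synthetic data: only columns 0 and 2 carry signal.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(60, 5))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 2] + rng.normal(scale=0.5, size=60)
subset_demo, err_demo = best_subset_by_cv(X_demo, y_demo)
print("selected columns:", subset_demo, "CV error:", round(err_demo, 3))
```

The enumeration grows as 2^p, which is the cost the abstract's single-level mixed-integer quadratic reformulation is meant to avoid by handing the search to optimization software.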

Keywords: Integer programming; Subset selection; Cross-validation; Ridge regression; Statistics; 62F07; 62J05; 90C11; 90C90
Date: 2020
Citations: 5 (in EconPapers)

Downloads: (external link)
http://link.springer.com/10.1007/s11750-020-00538-1 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:topjnl:v:28:y:2020:i:2:d:10.1007_s11750-020-00538-1

Ordering information: This journal article can be ordered from
http://link.springer.de/orders.htm

DOI: 10.1007/s11750-020-00538-1

TOP: An Official Journal of the Spanish Society of Statistics and Operations Research is currently edited by Juan José Salazar González and Gustavo Bergantiños

More articles in TOP: An Official Journal of the Spanish Society of Statistics and Operations Research from Springer, Sociedad de Estadística e Investigación Operativa
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:topjnl:v:28:y:2020:i:2:d:10.1007_s11750-020-00538-1