Model selection and error estimation
Peter L. Bartlett, Stéphane Boucheron and Gabor Lugosi
Economics Working Papers from Department of Economics and Business, Universitat Pompeu Fabra
Abstract:
We study model selection strategies based on penalized empirical loss minimization. We point out a tight relationship between error estimation and data-based complexity penalization: any good error estimate may be converted into a data-based penalty function and the performance of the estimate is governed by the quality of the error estimate. We consider several penalty functions, involving error estimates on independent test data, empirical VC dimension, empirical VC entropy, and margin-based quantities. We also consider the maximal difference between the error on the first half of the training data and the second half, and the expected maximal discrepancy, a closely related capacity estimate that can be calculated by Monte Carlo integration. Maximal discrepancy penalty functions are appealing for pattern classification problems, since their computation is equivalent to empirical risk minimization over the training data with some labels flipped.
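Illustration (not part of the paper's text): the label-flipping equivalence mentioned in the abstract can be checked numerically with a minimal sketch. It rests on several assumptions: labels take values in {-1, +1}, the sample size is even, the discrepancy is taken as the error on the first half minus the error on the second half, and the model class is a hypothetical family of one-dimensional threshold classifiers chosen only because it can be searched exhaustively. All function names are illustrative. Under these assumptions, flipping the labels of the first half and running empirical risk minimization gives a minimal error m, and the maximal discrepancy equals 1 - 2m.

```python
import random


def stump_predictions(xs):
    """Yield every labeling of xs produced by a 1-D threshold classifier.

    Hypothetical model class: f(x) = s if x > t else -s, with s in {-1, +1}.
    """
    thresholds = [float("-inf")] + sorted(set(xs))
    for t in thresholds:
        for s in (-1, 1):
            yield [s if x > t else -s for x in xs]


def error_rate(preds, ys):
    """Fraction of positions where the prediction differs from the label."""
    return sum(p != y for p, y in zip(preds, ys)) / len(ys)


def max_discrepancy_direct(xs, ys):
    """Maximize (error on first half) - (error on second half) over the class."""
    half = len(xs) // 2
    return max(
        error_rate(preds[:half], ys[:half]) - error_rate(preds[half:], ys[half:])
        for preds in stump_predictions(xs)
    )


def max_discrepancy_via_flipping(xs, ys):
    """Same quantity via ERM on the sample with first-half labels flipped."""
    half = len(xs) // 2
    flipped = [-y for y in ys[:half]] + list(ys[half:])
    min_err = min(error_rate(preds, flipped) for preds in stump_predictions(xs))
    return 1.0 - 2.0 * min_err


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [rng.random() for _ in range(20)]       # even sample size (assumed)
    ys = [rng.choice((-1, 1)) for _ in xs]       # synthetic labels
    print(max_discrepancy_direct(xs, ys), max_discrepancy_via_flipping(xs, ys))
```

The two functions should agree up to floating-point rounding. The exhaustive search stands in for whatever empirical risk minimizer is available for the model class at hand.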
Keywords: Complexity regularization; model selection; error estimation; concentration of measure
JEL-codes: C13 C14
Date: 2000-10
New Economics Papers: this item is included in nep-ecm
Citations: 11 (in EconPapers)
Downloads: (external link)
https://econ-papers.upf.edu/papers/508.pdf Whole Paper (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:upf:upfgen:508