Ensembles of Overfit and Overconfident Forecasts
Yael Grushka-Cockayne,
Victor Richmond R. Jose and
Kenneth C. Lichtendahl
Additional contact information
Yael Grushka-Cockayne: Darden School of Business, University of Virginia, Charlottesville, Virginia 22903
Victor Richmond R. Jose: McDonough School of Business, Georgetown University, Washington, DC 20057
Kenneth C. Lichtendahl: Darden School of Business, University of Virginia, Charlottesville, Virginia 22903
Management Science, 2017, vol. 63, issue 4, 1110-1130
Abstract:
Firms today average forecasts collected from multiple experts and models. Because of cognitive biases, strategic incentives, or the structure of machine-learning algorithms, these forecasts are often overfit to sample data and are overconfident. Little is known about the challenges associated with aggregating such forecasts. We introduce a theoretical model to examine the combined effect of overfitting and overconfidence on the average forecast. Their combined effect is that the mean and median probability forecasts are poorly calibrated with hit rates of their prediction intervals too high and too low, respectively. Consequently, we prescribe the use of a trimmed average, or trimmed opinion pool, to achieve better calibration. We identify the random forest, a leading machine-learning algorithm that pools hundreds of overfit and overconfident regression trees, as an ideal environment for trimming probabilities. Using several known data sets, we demonstrate that trimmed ensembles can significantly improve the random forest’s predictive accuracy.
Keywords: wisdom of crowds; base-rate neglect; linear opinion pool; trimmed opinion pool; hit rate; calibration; random forest; data science
Date: 2017
Citations: 14 (EconPapers)
Downloads: https://doi.org/10.1287/mnsc.2015.2389 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:ormnsc:v:63:y:2017:i:4:p:1110-1130