Comprehensive OOS Evaluation of Predictive Algorithms with Statistical Decision Theory
Jeff Dominitz and Charles F. Manski
Papers from arXiv.org
Abstract:
We argue that comprehensive out-of-sample (OOS) evaluation using statistical decision theory (SDT) should replace the current practice of K-fold and Common Task Framework validation in machine learning (ML) research on prediction. SDT provides a formal frequentist framework for performing comprehensive OOS evaluation across all possible (1) training samples, (2) populations that may generate training data, and (3) populations of prediction interest. Regarding feature (3), we emphasize that SDT requires the practitioner to directly confront the possibility that the future may not look like the past and to account for a possible need to extrapolate from one population to another when building a predictive algorithm. For specificity, we consider treatment choice using conditional predictions with alternative restrictions on the state space of possible populations that may generate training data. We discuss application of SDT to the problem of predicting patient illness to inform clinical decision making. SDT is simple in abstraction, but it is often computationally demanding to implement. We call on ML researchers, econometricians, and statisticians to expand the domain within which implementation of SDT is tractable.
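As a rough illustration of what evaluation "across all possible training samples" and across a state space of populations can look like, the sketch below works through a deliberately simple prediction problem. It is not the authors' application. The setup is an assumption made for illustration: binary outcomes, square loss, a training sample of size n drawn from a Bernoulli population with mean p_train, and a prediction population with mean p_target that may differ from p_train by at most delta. Maximum regret is used as the evaluation criterion, which is one criterion available in SDT; the abstract does not specify one. All function names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of k successes in n Bernoulli(p) draws."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def regret(rule, n, p_train, p_target):
    """Expected excess square loss of `rule` in one state (p_train, p_target).

    The expectation is taken exactly over every possible training sample
    (summarized by its success count k), not over a single held-out split.
    Under square loss the best prediction is p_target, so the regret of a
    prediction r is (r - p_target)**2.
    """
    return sum(
        binom_pmf(k, n, p_train) * (rule(k, n) - p_target) ** 2
        for k in range(n + 1)
    )

def max_regret(rule, n, grid, delta):
    """Worst-case regret over the assumed state space.

    States are pairs (p_train, p_target) on `grid` with
    |p_target - p_train| <= delta. delta = 0 means the future looks like
    the past; delta > 0 allows a need to extrapolate.
    """
    return max(
        regret(rule, n, p_train, p_target)
        for p_train in grid
        for p_target in grid
        if abs(p_target - p_train) <= delta + 1e-12
    )

def sample_mean(k, n):
    """Predict with the training-sample frequency."""
    return k / n

def shrunk_mean(k, n):
    """Shrink toward 1/2; the classical minimax rule when delta = 0."""
    return (k + n**0.5 / 2) / (n + n**0.5)

if __name__ == "__main__":
    grid = [i / 20 for i in range(21)]  # assumed discretization of [0, 1]
    for delta in (0.0, 0.1):
        for rule in (sample_mean, shrunk_mean):
            value = max_regret(rule, n=10, grid=grid, delta=delta)
            print(f"delta={delta:.1f} {rule.__name__}: {value:.4f}")
```

In this toy problem the worst-case regret of a rule can only grow as delta increases, since a larger delta enlarges the state space over which the maximum is taken. Even here the evaluation loops over every state and every training sample, which hints at why implementation becomes computationally demanding in richer settings.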
Date: 2024-03, Revised 2025-04
New Economics Papers: this item is included in nep-big
Download: http://arxiv.org/pdf/2403.11016 (latest version, PDF)
Related works:
Working Paper: Comprehensive OOS Evaluation of Predictive Algorithms with Statistical Decision Theory (2024) 
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2403.11016