Using Predicted Outcome Stratified Sampling to Reduce the Variability in Predictive Performance of a One-Shot Train-and-Test Split for Individual Customer Predictions
G. Verstraeten and Dirk Van den Poel
Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium from Ghent University, Faculty of Economics and Business Administration
Abstract:
It is generally recognized that performance estimates are overly optimistic when a model is evaluated on the data used to construct it. In predictive modeling practice, the assessment of a model’s predictive performance therefore frequently relies on a one-shot train-and-test split between the observations used for estimating a model and those used for validating it. Previous research has indicated the usefulness of stratified sampling for reducing the variation in predictive performance in a linear regression application. In this paper, we validate the previous findings on six real-life European predictive modeling applications for marketing and credit scoring using a dichotomous outcome variable. We find confirmation for the reduction in variability using a procedure we describe as predicted outcome stratified sampling in a logistic regression model, and we find that the gain in variation reduction is, even in large data sets, almost always significant and in certain applications markedly high.
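The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of one plausible reading of predicted outcome stratified sampling, not the authors' implementation. It assumes a predicted score per observation is already available (for example, probabilities from a preliminary logistic regression), ranks observations by that score, cuts the ranking into equal-sized strata, and splits each stratum at random between the estimation and validation sets. The function name, the `n_strata` and `test_fraction` parameters, and the use of equal-sized rank-based strata are all illustrative assumptions.

```python
import random


def predicted_outcome_stratified_split(scores, test_fraction=0.5,
                                       n_strata=10, seed=0):
    """Split observation indices into train and test sets so that both
    sets have a similar distribution of predicted scores.

    scores: one predicted outcome per observation. Assumed here to come
            from a preliminary model (hypothetical input, not specified
            in the abstract).
    Returns (train_indices, test_indices), each sorted ascending.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must lie strictly between 0 and 1")
    if n_strata < 1:
        raise ValueError("n_strata must be at least 1")

    rng = random.Random(seed)
    n = len(scores)

    # Rank observations from lowest to highest predicted score.
    order = sorted(range(n), key=lambda i: scores[i])

    train, test = [], []
    for s in range(n_strata):
        # Each stratum is a contiguous block of the ranking.
        lo = s * n // n_strata
        hi = (s + 1) * n // n_strata
        stratum = order[lo:hi]

        # Random assignment within the stratum.
        rng.shuffle(stratum)
        n_test = round(len(stratum) * test_fraction)
        test.extend(stratum[:n_test])
        train.extend(stratum[n_test:])

    return sorted(train), sorted(test)
```

Because every stratum contributes roughly the same share of its observations to each set, the two sets cannot differ much in their mix of low-score and high-score cases, which is the intuition behind the reduced variability reported in the abstract. A plain random split offers no such guarantee.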
Pages: 10
Date: 2006-01
New Economics Papers: this item is included in nep-ecm and nep-mkt
Download: http://wps-feb.ugent.be/Papers/wp_06_360.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:rug:rugwps:06/360