A comparison of machine learning model validation schemes for non-stationary time series data
Matthias Schnaubelt
No 11/2019, FAU Discussion Papers in Economics from Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics
Abstract:
Machine learning is increasingly applied to time series data, as it constitutes an attractive alternative to forecasts based on traditional time series models. For independent and identically distributed observations, cross-validation is the prevalent scheme for estimating out-of-sample performance in both model selection and assessment. For time series data, however, it is unclear whether forward-validation schemes, i.e., schemes that keep the temporal order of observations, should be preferred. In this paper, we perform a comprehensive empirical study of eight common validation schemes. We introduce a study design that perturbs global stationarity by introducing a slow evolution of the underlying data-generating process. Our results demonstrate that, even for relatively small perturbations, commonly used cross-validation schemes often yield estimates with the largest bias and variance, and forward-validation schemes yield better estimates of the out-of-sample error. We provide an interpretation of these results in terms of an additional evolution-induced bias and the sample-size dependent estimation error. Using a large-scale financial data set, we demonstrate the practical significance in a replication study of a statistical arbitrage problem. We conclude with some general guidelines on the selection of suitable validation schemes for time series data.
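The distinction the abstract draws between cross-validation and forward-validation can be illustrated with a minimal sketch of how each family partitions a series of time-ordered observations. The function names `kfold_splits` and `forward_splits`, the contiguous-block layout, and the growing training window are assumptions made for illustration only; they are not taken from the paper and need not match any of the eight schemes it studies.

```python
from typing import Iterator, List, Tuple

# (training indices, validation indices); indices are ordered in time.
Split = Tuple[List[int], List[int]]


def kfold_splits(n_obs: int, n_folds: int) -> Iterator[Split]:
    """Hypothetical contiguous-block k-fold cross-validation.

    Training data may lie both before and after the validation block,
    so the temporal order of observations is not respected.
    """
    fold_size = n_obs // n_folds
    for k in range(n_folds):
        start = k * fold_size
        # The last fold absorbs any remainder observations.
        stop = n_obs if k == n_folds - 1 else start + fold_size
        val = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_obs))
        yield train, val


def forward_splits(n_obs: int, n_folds: int) -> Iterator[Split]:
    """Hypothetical growing-window forward validation.

    Every validation block comes strictly after all of its training
    data, so the temporal order of observations is kept.
    """
    fold_size = n_obs // (n_folds + 1)
    for k in range(1, n_folds + 1):
        start = k * fold_size
        stop = n_obs if k == n_folds else start + fold_size
        yield list(range(0, start)), list(range(start, stop))


if __name__ == "__main__":
    for name, splitter in [("k-fold", kfold_splits), ("forward", forward_splits)]:
        for train, val in splitter(12, 3):
            print(name, "train:", train, "validate:", val)
```

In this sketch the forward scheme trains its early folds on fewer observations than k-fold does, while k-fold lets observations from after the validation block enter training. Which of the two effects matters more under a slowly evolving data-generating process is the question the paper investigates.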
Keywords: machine learning; model selection; model validation; time series; cross-validation
Date: 2019
New Economics Papers: this item is included in nep-big, nep-cmp, nep-ecm, nep-ets and nep-pay
Downloads:
https://www.econstor.eu/bitstream/10419/209136/1/1684440068.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:zbw:iwqwdp:112019