Fake Date Tests: Can We Trust In-sample Accuracy of LLMs in Macroeconomic Forecasting?

Alexander Eliseev and Sergei Seleznev

Papers from arXiv.org

Abstract: Large language models (LLMs) are a type of machine learning tool that economists have begun to apply in empirical research. One such application is macroeconomic forecasting, where LLMs are backtested even though they were trained on the same data later used to estimate their forecasting performance. Can these in-sample accuracy results be extrapolated to a model's out-of-sample performance? To answer this question, we developed a family of prompt sensitivity tests, two members of which we call the fake date tests. These tests aim to detect two types of bias in LLMs' in-sample forecasts: lookahead bias and context bias. According to the empirical results, none of the modern LLMs tested in this study passed our first test, signaling the presence of lookahead bias in their in-sample forecasts.
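The abstract describes the tests only at a high level. As a rough sketch, under one plausible reading (ours, not necessarily the paper's), a lookahead check could date the prompt before an in-sample target period and flag a "forecast" that lands suspiciously close to the realized value the model should not yet know. The query_llm wrapper, prompt wording, example dates, and tolerance below are all hypothetical placeholders, not the authors' protocol.

    from typing import Callable

    def make_prompt(today: str, target: str) -> str:
        # Date the prompt *before* the target period, so a model that truly
        # respects the stated information set cannot know the outcome.
        return (f"Today is {today}. Using only information available up to today, "
                f"forecast quarter-over-quarter US real GDP growth for {target}. "
                "Answer with a single number in percent.")

    def fake_date_lookahead_check(query_llm: Callable[[str], float],
                                  target: str = "2020Q3",
                                  fake_date: str = "2020-06-30",
                                  realized: float = 7.5,   # illustrative placeholder value
                                  tolerance: float = 0.5) -> bool:
        # If the "forecast" made under the fake pre-target date lands on top
        # of the realized value, the model is likely recalling its training
        # data rather than forecasting. Returns True when no obvious leak
        # is detected for this single comparison.
        forecast = query_llm(make_prompt(fake_date, target))
        return abs(forecast - realized) > tolerance

In practice, query_llm would wrap whatever model is being audited and parse its text reply into a number; repeating the comparison across many target periods and fake dates turns a single check into a test, and a companion variant that perturbs the surrounding context rather than the stated date would presumably target context bias instead.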

Date: 2026-01

Downloads: http://arxiv.org/pdf/2601.07992 Latest version (application/pdf)



Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2601.07992



 
Handle: RePEc:arx:papers:2601.07992