Fake Date Tests: Can We Trust In-sample Accuracy of LLMs in Macroeconomic Forecasting?
Alexander Eliseev and
Sergei Seleznev
Additional contact information
Alexander Eliseev: Bank of Russia, Russian Federation
Sergei Seleznev: Bank of Russia, Russian Federation
No wps167, Bank of Russia Working Paper Series from Bank of Russia
Abstract:
Large language models (LLMs) are a type of machine learning tool that economists have started to apply in their empirical research. One such application is macroeconomic forecasting with backtested LLMs, even though these models are trained on the very data used to estimate their forecasting performance. Can these in-sample accuracy results be extrapolated to a model's out-of-sample performance? To answer this question, we develop a family of prompt sensitivity tests, including two members of this family that we call the fake date tests. These tests aim to detect two types of biases in LLMs' in-sample forecasts: lookahead bias and context bias. According to the empirical results, none of the modern LLMs tested in this study passed our tests, signaling the presence of biases in their in-sample forecasts.
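The intuition behind a fake date test can be sketched in a few lines. This is a toy illustration only, not the paper's actual protocol: the prompt template, the function names, and the error numbers are all hypothetical. The core idea is that if a model genuinely ignores post-cutoff knowledge, changing the date stated in the prompt should not change its average forecast accuracy.

```python
from statistics import mean

def make_prompt(cutoff_date: str, history: list[float]) -> str:
    # Hypothetical prompt template: the model is told the data end at cutoff_date.
    # In a fake date test, the same history is paired with a shifted date.
    return (f"Today is {cutoff_date}. Given the quarterly inflation readings "
            f"{history}, forecast the next quarter's inflation.")

def fake_date_gap(errors_true_date: list[float],
                  errors_fake_date: list[float]) -> float:
    # If in-sample forecasts are unbiased, shifting the stated date should
    # leave average accuracy unchanged, so this gap should be near zero.
    # A clearly positive gap hints at lookahead bias: the model forecasts
    # better when the prompt carries the true (in-training) date.
    return mean(errors_fake_date) - mean(errors_true_date)

# Toy absolute forecast errors for the same periods under the two prompts.
gap = fake_date_gap([0.2, 0.3, 0.25], [0.5, 0.6, 0.55])
print(f"fake date gap: {gap:.2f}")  # here 0.30, suggesting bias
```

In practice one would compare the two error distributions with a formal statistical test rather than a raw difference in means, but the mechanics of the comparison are the same.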
Keywords: large language models; macroeconomic forecasting; lookahead bias; context bias (search for similar items in EconPapers)
JEL-codes: C12 C52 C53 (search for similar items in EconPapers)
Pages: 131
Date: 2026-03
Downloads: http://www.cbr.ru/StaticHtml/File/188046/wp_167.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:bkr:wpaper:wps167