ChatMacro: Evaluating Inflation Forecasts of Generative AI
M. Jahangir Alam,
Shane Boyle,
Huiyu Li and
Tatevik Sekhposyan
No 21057, CEPR Discussion Papers from Centre for Economic Policy Research
Abstract:
Recent research suggests that generic large language models (LLMs) can match the accuracy of traditional methods when forecasting macroeconomic variables in pseudo out-of-sample settings generated via prompts. This paper assesses the out-of-sample forecasting accuracy of LLMs by eliciting real-time forecasts of U.S. inflation from ChatGPT. We find that out-of-sample predictions are largely inaccurate and stale, even though forecasts generated in pseudo out-of-sample environments are comparable to existing benchmarks. Our results underscore the importance of out-of-sample benchmarking for LLM predictions.
Date: 2026-01
Downloads:
https://cepr.org/publications/DP21057 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:cpr:ceprdp:21057
More papers in CEPR Discussion Papers from Centre for Economic Policy Research 33 Great Sutton Street, London EC1V 0DX, UK.