ChatMacro: Evaluating Inflation Forecasts of Generative AI

M. Jahangir Alam, Shane Boyle, Huiyu Li and Tatevik Sekhposyan

No 2026-04, Working Paper Series from Federal Reserve Bank of San Francisco

Abstract: Recent research suggests that generic large language models (LLMs) can match the accuracy of traditional methods when forecasting macroeconomic variables in pseudo out-of-sample settings generated via prompts. This paper assesses the out-of-sample forecasting accuracy of LLMs by eliciting real-time forecasts of U.S. inflation from ChatGPT. We find that out-of-sample predictions are largely inaccurate and stale, even though forecasts generated in pseudo out-of-sample environments are comparable to existing benchmarks. Our results underscore the importance of out-of-sample benchmarking for LLM predictions.

Keywords: large language models; generative AI; inflation forecasting
JEL-codes: C45 E31 E37
Pages: 24
Date: 2026-02-05
Note: PDF date: January 27, 2026.

Downloads: (external link)
https://www.frbsf.org/wp-content/uploads/wp2026-04.pdf PDF - view (application/pdf)
https://www.frbsf.org/research-and-insights/public ... ts-generative-of-ai/ FRBSF - view (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:fip:fedfwp:102407

DOI: 10.24148/wp2026-04

Bibliographic data for series maintained by Federal Reserve Bank of San Francisco Research Library.

 
Page updated 2026-02-06
Handle: RePEc:fip:fedfwp:102407