ChatGPT Hallucinates Non-existent Citations: Evidence from Economics
Joy Buchanan, Stephen Hill and Olga Shapoval
The American Economist, 2024, vol. 69, issue 1, 80-87
Abstract:
In this study, we generate prompts derived from every topic within the Journal of Economic Literature to assess the ability of both the GPT-3.5 and GPT-4 versions of the ChatGPT large language model (LLM) to write about economic concepts. ChatGPT demonstrates considerable competency in offering general summaries but also cites non-existent references. More than 30% of the citations provided by the GPT-3.5 version do not exist, and this rate is only slightly reduced for the GPT-4 version. Additionally, our findings suggest that the reliability of the model decreases as the prompts become more specific. We provide quantitative evidence for errors in ChatGPT output to demonstrate the importance of LLM verification.
JEL Codes: B4; O33; I2
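The verification step the abstract argues for can be illustrated with a minimal sketch: query the public Crossref works API for each title an LLM cites and flag titles with no close match. This is only an illustration of the idea, not the procedure used in the paper; the similarity threshold and the citation_exists helper are assumptions made for this sketch.

"""Illustrative check of whether an LLM-supplied citation matches a real record.

A minimal sketch, not the authors' method: the Crossref endpoint is real,
but the title-similarity cutoff and helper name are assumptions.
"""
from difflib import SequenceMatcher

import requests  # third-party HTTP client

CROSSREF_WORKS = "https://api.crossref.org/works"


def citation_exists(title: str, threshold: float = 0.85) -> bool:
    """Return True if Crossref holds a work whose title closely matches `title`.

    `threshold` is an arbitrary similarity cutoff chosen for illustration.
    """
    resp = requests.get(
        CROSSREF_WORKS,
        params={"query.bibliographic": title, "rows": 5},
        timeout=30,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    for item in items:
        candidate = (item.get("title") or [""])[0]
        similarity = SequenceMatcher(None, title.lower(), candidate.lower()).ratio()
        if similarity >= threshold:
            return True
    return False


if __name__ == "__main__":
    # A hypothetical citation of the kind an LLM might produce; it is flagged
    # as a possible hallucination if no close Crossref match is found.
    claimed = "Monetary Policy and Herd Behavior in Emerging Labor Markets"
    print("found" if citation_exists(claimed) else "possible hallucination")

Fuzzy title matching is used here because hallucinated citations often resemble, but do not exactly reproduce, real titles; a fuller check would also compare authors, year, and DOI.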
Keywords: artificial intelligence; large language models; ChatGPT; writing; research methods
Date: 2024
Downloads: https://journals.sagepub.com/doi/10.1177/05694345231218454 (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:sae:amerec:v:69:y:2024:i:1:p:80-87
DOI: 10.1177/05694345231218454