Who is More Bayesian: Humans or ChatGPT?

Rust, John; Mu, Tianshi; Rawat, Pranjal; Zhang, Chengjun; Zhong, Qixuan

John Rust, Tianshi Mu, Pranjal Rawat, Chengjun Zhang and Qixuan Zhong
Additional contact information
John Rust: Department of Economics, Georgetown University, https://editorialexpress.com/jrust
Tianshi Mu: Tsinghua University, https://tianshimu.netlify.app/
Pranjal Rawat: Georgetown University, https://github.com/rawatpranjal
Chengjun Zhang: Morgan Stanley
Qixuan Zhong: Department of Economics, Georgetown University, https://econ.georgetown.edu

Working Papers from Georgetown University, Department of Economics

Abstract: We compare human and artificially intelligent (AI) subjects in classification tasks where the optimal decision rule is given by Bayes’ Rule. Experimental studies reach mixed conclusions about whether human beliefs and decisions accord with Bayes’ Rule. We reanalyze landmark experiments using a new model of decision making and show that decisions can be nearly optimal even when beliefs are not Bayesian. Using an objective measure of “decision efficiency,” we find that humans are 96% efficient despite the fact that only a minority have Bayesian beliefs. We replicate these same experiments using three generations of ChatGPT as subjects. Using the reasoning provided by GPT responses to understand its “thought process,” we find that GPT-3.5 ignores the prior and is only 75% efficient, whereas GPT-4 and GPT-4o use Bayes’ Rule and are 93% and 99% efficient, respectively. Most errors by GPT-4 and GPT-4o are algebraic mistakes in computing the posterior, but GPT-4o is far less error-prone. GPT performance increased from sub-human to super-human in just 3 years. By version 4o, its beliefs and decision making had become nearly perfectly Bayesian.
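
The gap the abstract describes between a subject who uses the prior and one who ignores it can be illustrated with a short calculation. The sketch below is not the paper's model or its efficiency measure. It assumes a hypothetical two-urn classification task (urn A has prior probability 1/3, the per-draw success probabilities are 2/3 for A and 1/2 for B, and six draws are observed), and it defines "efficiency" simply as a rule's probability of a correct classification divided by that of the Bayes-optimal rule. All function names and numbers are illustrative assumptions.

```python
from math import comb


def likelihood(p, n, k):
    """Binomial probability of k successes in n draws with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def posterior_a(prior_a, p_a, p_b, n, k):
    """Bayes' Rule: posterior probability that the sample came from urn A."""
    joint_a = prior_a * likelihood(p_a, n, k)
    joint_b = (1 - prior_a) * likelihood(p_b, n, k)
    return joint_a / (joint_a + joint_b)


def win_probability(rule, prior_a, p_a, p_b, n):
    """Probability that rule(k) names the true urn, summed over all outcomes k."""
    total = 0.0
    for k in range(n + 1):
        joint_a = prior_a * likelihood(p_a, n, k)
        joint_b = (1 - prior_a) * likelihood(p_b, n, k)
        total += joint_a if rule(k) == "A" else joint_b
    return total


# Hypothetical task parameters (assumptions, not taken from the paper).
PRIOR_A, P_A, P_B, N = 1 / 3, 2 / 3, 1 / 2, 6


def bayes_rule(k):
    """Choose A when the true posterior of A is at least one half."""
    return "A" if posterior_a(PRIOR_A, P_A, P_B, N, k) >= 0.5 else "B"


def ignore_prior_rule(k):
    """Choose A on the likelihoods alone, as if the prior were 1/2."""
    return "A" if posterior_a(0.5, P_A, P_B, N, k) >= 0.5 else "B"


bayes_win = win_probability(bayes_rule, PRIOR_A, P_A, P_B, N)
naive_win = win_probability(ignore_prior_rule, PRIOR_A, P_A, P_B, N)

# Illustrative efficiency: accuracy relative to the Bayes-optimal rule.
efficiency = naive_win / bayes_win

print(f"Bayes rule win probability:   {bayes_win:.4f}")
print(f"Ignore-prior win probability: {naive_win:.4f}")
print(f"Relative efficiency:          {efficiency:.4f}")
```

In this made-up setting the two rules disagree only when exactly four successes are drawn, so the prior-ignoring rule loses accuracy on that outcome alone. The same structure shows why a subject whose beliefs are distorted can still decide well: any belief that stays on the correct side of one half yields the same classification as the Bayesian posterior.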

Keywords: Bayes’ Rule; decision making; statistical decision theory; win and loss functions; learning; Bayes’ compatible beliefs; noisy Bayesians; classification; machine learning; artificial intelligence; large language models; ChatGPT; maximum likelihood; heterogeneity; mixture models; Estimation-Classification (EC) algorithm; binary logit model; structural models
JEL-codes: C91 D91
Pages: 66
Date: 2025-07-10
New Economics Papers: this item is included in nep-ain, nep-cmp and nep-exp

Downloads: (external link)
https://arxiv.org/abs/2504.10636 Full text (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:geo:guwopa:gueconwpa~25-25-02

Ordering information: This working paper can be ordered from
Roger Lagunoff, Professor of Economics, Georgetown University, Department of Economics, Washington, DC 20057-1036
http://econ.georgetown.edu/
