Artificial intelligence and dichotomania
Blakeley B. McShane,
David Gal and
Adam Duhachek
Judgment and Decision Making, 2025, vol. 20
Abstract:
Large language models (LLMs) such as ChatGPT, Gemini, and Claude are increasingly used to aid or replace human judgment and decision making. Indeed, academic researchers are increasingly using LLMs as a research tool. In this paper, we examine whether LLMs, like academic researchers, fall prey to a particularly common human error in interpreting statistical results, namely ‘dichotomania’, which results from dichotomizing statistical results into the categories ‘statistically significant’ and ‘statistically nonsignificant’. We find that ChatGPT, Gemini, and Claude fall prey to dichotomania at the 0.05 and 0.10 thresholds commonly used to declare ‘statistical significance’. In addition, prompt engineering with principles from the American Statistical Association's Statement on Statistical Significance and P-values, intended as a corrective to human errors, does not mitigate this behavior and arguably exacerbates it. Further, more recent and larger versions of these models do not necessarily perform better. Finally, these models sometimes provide interpretations that are not only incorrect but also highly erratic.
Downloads: (external link)
https://www.cambridge.org/core/product/identifier/ ... type/journal_article link to article abstract page (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:cup:judgdm:v:20:y:2025:i::p:-_23
More articles in Judgment and Decision Making from Cambridge University Press, UPH, Shaftesbury Road, Cambridge CB2 8BS, UK.
Bibliographic data for series maintained by Kirk Stebbing.