Large-scale experimental investigation of the reliability of confidence measures
Clémentine Bouleau, Maël Lebreton and Nicolas Jacquemet
Additional contact information
Clémentine Bouleau: PSE - Paris School of Economics - UP1 - Université Paris 1 Panthéon-Sorbonne - ENS-PSL - École normale supérieure - Paris - PSL - Université Paris Sciences et Lettres - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement - ENPC - École nationale des ponts et chaussées - IP Paris - Institut Polytechnique de Paris, CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique
Maël Lebreton: PSE - Paris School of Economics - UP1 - Université Paris 1 Panthéon-Sorbonne - ENS-PSL - École normale supérieure - Paris - PSL - Université Paris Sciences et Lettres - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement - ENPC - École nationale des ponts et chaussées - IP Paris - Institut Polytechnique de Paris, PJSE - Paris Jourdan Sciences Economiques - UP1 - Université Paris 1 Panthéon-Sorbonne - ENS-PSL - École normale supérieure - Paris - PSL - Université Paris Sciences et Lettres - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement - ENPC - École nationale des ponts et chaussées - IP Paris - Institut Polytechnique de Paris
PSE-Ecole d'économie de Paris (Postprint) from HAL
Abstract:
Whether individuals feel confident that their own actions, choices, or statements are correct, and how these confidence levels differ between individuals, are two key primitives for countless behavioral theories and phenomena. In cognitive tasks, individual confidence is typically measured as the average of reports about choice accuracy, but how reliable the resulting characterization of within- and between-individual confidence is remains surprisingly undocumented. Here, we perform a large-scale resampling exercise in the Confidence Database (103 studies, 6000 participants) to investigate the reliability of individual confidence estimates, and of comparisons across individuals' confidence levels. Our results show that confidence estimates are more stable than their choice-accuracy counterparts, reaching a reliability plateau after roughly 50 trials, regardless of several task-design characteristics. While these results constitute a reliability upper bound for task-based confidence measures, and thereby leave open the question of the reliability of the construct itself, they characterize the robustness of past and future task designs.
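The abstract does not spell out the resampling procedure, so the following is only a minimal sketch of one common way to trace reliability against trial count: a split-half correlation of participants' mean confidence, averaged over random resamples. It runs on synthetic data, not the Confidence Database, and every name and parameter in it (`simulate`, `split_half_reliability`, the noise level, the trial counts) is a hypothetical choice made for illustration rather than the authors' method.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def simulate(n_participants=60, n_trials=200, seed=1):
    """Synthetic confidence reports in [0, 1]; purely illustrative data."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_participants):
        mu = rng.uniform(0.55, 0.95)  # assumed true mean confidence
        trials = [min(1.0, max(0.0, rng.gauss(mu, 0.2))) for _ in range(n_trials)]
        data.append(trials)
    return data


def split_half_reliability(ratings, n_trials, n_resamples=200, seed=0):
    """Average correlation between two disjoint n_trials-sized halves.

    For each resample, every participant contributes two non-overlapping
    random subsets of trials; the means of the two subsets are then
    correlated across participants.
    """
    rng = random.Random(seed)
    correlations = []
    for _ in range(n_resamples):
        first, second = [], []
        for r in ratings:
            sample = rng.sample(r, 2 * n_trials)  # drawn without replacement
            first.append(mean(sample[:n_trials]))
            second.append(mean(sample[n_trials:]))
        correlations.append(pearson(first, second))
    return mean(correlations)


ratings = simulate()
curve = {n: split_half_reliability(ratings, n) for n in (5, 20, 50, 100)}
for n, r in curve.items():
    print(f"{n:>3} trials per half: reliability = {r:.3f}")
```

In this toy setting the curve rises with the number of trials and flattens as trial-level noise averages out; where it flattens depends entirely on the assumed noise level, so it should not be read as reproducing the plateau reported in the abstract.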
Date: 2025-11-18
Note: View the original document on HAL open archive server: https://shs.hal.science/halshs-05371342v1
Published in Communications Psychology, 2025, 3, pp.159. ⟨10.1038/s44271-025-00330-6⟩
Downloads: (external link)
https://shs.hal.science/halshs-05371342v1/document (application/pdf)
Related works:
Working Paper: Large-scale experimental investigation of the reliability of confidence measures (2025) 
Persistent link: https://EconPapers.repec.org/RePEc:hal:pseptp:halshs-05371342
DOI: 10.1038/s44271-025-00330-6
Bibliographic data for series maintained by Caroline Bauer.