Large-scale experimental investigation of the reliability of confidence measures

Bouleau, Clémentine; Lebreton, Mael; Jacquemet, Nicolas

Clémentine Bouleau: PSE - Paris School of Economics - UP1 - Université Paris 1 Panthéon-Sorbonne - ENS-PSL - École normale supérieure - Paris - PSL - Université Paris Sciences et Lettres - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement - ENPC - École nationale des ponts et chaussées - IP Paris - Institut Polytechnique de Paris, CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique

PSE-Ecole d'économie de Paris (Postprint) from HAL

Abstract: Whether individuals feel confident that their own actions, choices, or statements are correct, and how these confidence levels differ between individuals, are two key primitives for countless behavioral theories and phenomena. In cognitive tasks, individual confidence is typically measured as the average of reports about choice accuracy, but the reliability of the resulting characterization of within- and between-individual confidence remains surprisingly undocumented. Here, we perform a large-scale resampling exercise in the Confidence Database (103 studies, 6000 participants) to investigate the reliability of individual confidence estimates and of comparisons across individuals' confidence levels. Our results show that confidence estimates are more stable than their choice-accuracy counterparts, reaching a reliability plateau after roughly 50 trials, regardless of a number of task design characteristics. While they constitute a reliability upper bound for task-based confidence measures, and thereby leave open the question of the reliability of the construct itself, these results characterize the robustness of past and future task designs.
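
Illustration (not taken from the paper): the abstract does not specify the resampling procedure, so the following is only a minimal sketch of one way such an exercise could be set up. It assumes a split-half design in which each participant's mean confidence is computed on two disjoint random subsets of trials and the two sets of means are correlated across participants. The function names, the simulated data, and the choice of Pearson correlation as the reliability metric are all assumptions made for illustration; nothing here uses the Confidence Database.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def split_half_reliability(ratings, n_trials, n_resamples=200, seed=0):
    """Average split-half correlation of mean confidence (assumed metric).

    ratings: one list of confidence reports per participant.
    For each resample, draw 2 * n_trials trials per participant without
    replacement, average each half, and correlate the two halves across
    participants.
    """
    rng = random.Random(seed)
    correlations = []
    for _ in range(n_resamples):
        first, second = [], []
        for trials in ratings:
            sample = rng.sample(trials, 2 * n_trials)
            first.append(sum(sample[:n_trials]) / n_trials)
            second.append(sum(sample[n_trials:]) / n_trials)
        correlations.append(pearson(first, second))
    return sum(correlations) / len(correlations)


def simulate_ratings(n_participants=60, n_trials=200, seed=1):
    """Synthetic data (hypothetical): each participant has a latent mean
    confidence; single-trial reports are noisy and clipped to [0.5, 1]."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_participants):
        mean_conf = rng.uniform(0.55, 0.95)
        data.append(
            [min(1.0, max(0.5, rng.gauss(mean_conf, 0.15))) for _ in range(n_trials)]
        )
    return data


ratings = simulate_ratings()
# Reliability as a function of the number of trials per half.
curve = {n: split_half_reliability(ratings, n) for n in (5, 10, 25, 50, 100)}
for n, r in curve.items():
    print(f"{n:>3} trials per half: mean split-half r = {r:.3f}")
```

Plotting such a curve against the number of trials is one way to look for a plateau of the kind the abstract reports; the shape obtained here depends entirely on the simulated noise levels and says nothing about the paper's actual results.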

Date: 2025-11-18
Note: View the original document on HAL open archive server: https://shs.hal.science/halshs-05371342v1

Published in Communications Psychology, 2025, 3, pp.159. ⟨10.1038/s44271-025-00330-6⟩

Downloads: (external link)
https://shs.hal.science/halshs-05371342v1/document (application/pdf)

Related works:
Working Paper: Large-scale experimental investigation of the reliability of confidence measures (2025)

Persistent link: https://EconPapers.repec.org/RePEc:hal:pseptp:halshs-05371342

DOI: 10.1038/s44271-025-00330-6
