Implicit bias in safety-aligned large language models: A multi-faceted evaluation of clinical decision-making and health equity
Qiufeng Jia, Yuhang Wen, Yuyan Liu, Hui Zhao, Qiongge Yu, Yu Long, Dan Sun and Yufeng Yu
PLOS ONE, 2026, vol. 21, issue 5, 1-18
Abstract:
Background: Large language models are increasingly integrated into healthcare for clinical decision support and patient communication. Although these models can pass explicit social bias tests, they may retain implicit biases (latent associations between social groups and attributes) that could influence medical judgment.
Objective: To systematically evaluate the presence, magnitude, and behavioral impact of implicit biases in large language models within the medical domain across six high-stakes categories: gender, race, socioeconomic status, health conditions, religion, and healthcare systems.
Design: A descriptive cross-sectional study using a multi-faceted evaluation framework.
Setting: Computational analysis of 10 mainstream global large language models, including proprietary models (ChatGPT-4o, Gemini-2.0-Flash) and open-source models (DeepSeek-V3, Qwen3).
Methods: We constructed 24 medical bias datasets across six categories. Bias was assessed using three methods: (1) the Large Language Model Word Association Test, a prompt-based method for revealing implicit biases; (2) the Large Language Model Relative Decision Test, a strategy for detecting subtle discrimination in situational decision-making; and (3) Paired-Prompt Analysis, used to examine whether implicit associations predict discriminatory decisions.
Results: All 10 models exhibited systematic implicit biases (Mean IAT Bias > 0) across all categories, with the strongest biases observed in Race (Mean = 0.61) and Socioeconomic Status (Mean = 0.56). Advanced reasoning capabilities (Chain-of-Thought) did not significantly reduce bias magnitude. Crucially, stronger implicit associations significantly predicted discriminatory choices in downstream medical decision tasks (p
Date: 2026
Downloads:
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0348819 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 48819&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0348819
DOI: 10.1371/journal.pone.0348819
More articles in PLOS ONE from Public Library of Science