Artificial Intelligence health advice accuracy varies across languages and contexts

Garg, Prashant; Fetzer, Thiemo

Abstract: Using basic health statements authorized by UK and EU registers and 9,100 journalist-vetted public-health assertions on topics such as abortion, COVID-19 and politics, drawn from sources ranging from peer-reviewed journals and government advisories to social media and news across the political spectrum, we benchmark six leading large language models in 21 languages. Despite high accuracy on English-centric textbook claims, performance falls in multiple non-European languages and fluctuates by topic and source. These results highlight the urgency of comprehensive multilingual, domain-aware validation before deploying AI in global health communication.

Date: 2025-04
New Economics Papers: this item is included in nep-big

Downloads: http://arxiv.org/pdf/2504.18310 (latest version, application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2504.18310
