Risks of AI scientists: prioritizing safeguarding over autonomy
Xiangru Tang,
Qiao Jin,
Kunlun Zhu,
Tongxin Yuan,
Yichi Zhang,
Wangchunshu Zhou,
Meng Qu,
Yilun Zhao,
Jian Tang,
Zhuosheng Zhang,
Arman Cohan,
Dov Greenbaum,
Zhiyong Lu and
Mark Gerstein
Additional contact information
Xiangru Tang: Yale University
Qiao Jin: National Institutes of Health
Kunlun Zhu: Mila-Quebec AI Institute
Tongxin Yuan: Shanghai Jiao Tong University
Yichi Zhang: Yale University
Wangchunshu Zhou: OPPO Research Institute
Meng Qu: Mila-Quebec AI Institute
Yilun Zhao: Yale University
Jian Tang: Mila-Quebec AI Institute
Zhuosheng Zhang: Shanghai Jiao Tong University
Arman Cohan: Yale University
Dov Greenbaum: Reichman University
Zhiyong Lu: National Institutes of Health
Mark Gerstein: Yale University
Nature Communications, 2025, vol. 16, issue 1, 1-11
Abstract:
AI scientists powered by large language models have demonstrated substantial promise in autonomously conducting experiments and facilitating scientific discoveries across various disciplines. While their capabilities are promising, these agents also introduce novel vulnerabilities that require careful consideration for safety. However, there has been limited comprehensive exploration of these vulnerabilities. This perspective examines vulnerabilities in AI scientists, shedding light on potential risks associated with their misuse, and emphasizing the need for safety measures. We begin by providing an overview of the potential risks inherent to AI scientists, taking into account user intent, the specific scientific domain, and their potential impact on the external environment. Then, we explore the underlying causes of these vulnerabilities and provide a scoping review of the limited existing works. Based on our analysis, we propose a triadic framework involving human regulation, agent alignment, and an understanding of environmental feedback (agent regulation) to mitigate these identified risks. Furthermore, we highlight the limitations and challenges associated with safeguarding AI scientists and advocate for the development of improved models, robust benchmarks, and comprehensive regulations.
Date: 2025
Downloads: https://www.nature.com/articles/s41467-025-63913-1 (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-63913-1
Ordering information: this journal article can be ordered from https://www.nature.com/ncomms/
DOI: 10.1038/s41467-025-63913-1
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.