A multimodal generative AI copilot for human pathology

Ming Y. Lu, Bowen Chen, Drew F. K. Williamson, Richard J. Chen, Melissa Zhao, Aaron K. Chow, Kenji Ikemura, Ahrong Kim, Dimitra Pouli, Ankush Patel, Amr Soliman, Chengkuan Chen, Tong Ding, Judy J. Wang, Georg Gerber, Ivy Liang, Long Phi Le, Anil V. Parwani, Luca L. Weishaupt and Faisal Mahmood
Additional contact information
Ming Y. Lu: Harvard Medical School
Bowen Chen: Harvard Medical School
Drew F. K. Williamson: Harvard Medical School
Richard J. Chen: Harvard Medical School
Melissa Zhao: Harvard Medical School
Aaron K. Chow: Ohio State University
Kenji Ikemura: Harvard Medical School
Ahrong Kim: Harvard Medical School
Dimitra Pouli: Harvard Medical School
Ankush Patel: Mayo Clinic
Amr Soliman: Ohio State University
Chengkuan Chen: Harvard Medical School
Tong Ding: Harvard Medical School
Judy J. Wang: Harvard Medical School
Georg Gerber: Harvard Medical School
Ivy Liang: Harvard Medical School
Long Phi Le: Harvard Medical School
Anil V. Parwani: Ohio State University
Luca L. Weishaupt: Harvard Medical School
Faisal Mahmood: Harvard Medical School

Nature, 2024, vol. 634, issue 8033, 466-473

Abstract: Computational pathology (refs. 1,2) has witnessed considerable progress in the development of both task-specific predictive models and task-agnostic self-supervised vision encoders (refs. 3,4). However, despite the explosive growth of generative artificial intelligence (AI), there have been few studies on building general-purpose multimodal AI assistants and copilots (ref. 5) tailored to pathology. Here we present PathChat, a vision-language generalist AI assistant for human pathology. We built PathChat by adapting a foundational vision encoder for pathology, combining it with a pretrained large language model and fine-tuning the whole system on over 456,000 diverse visual-language instructions consisting of 999,202 question and answer turns. We compare PathChat with several multimodal vision-language AI assistants and GPT-4V, which powers the commercially available multimodal general-purpose AI assistant ChatGPT-4 (ref. 6). PathChat achieved state-of-the-art performance on multiple-choice diagnostic questions from cases with diverse tissue origins and disease models. Furthermore, using open-ended questions and human expert evaluation, we found that overall PathChat produced more accurate and pathologist-preferable responses to diverse queries related to pathology. As an interactive vision-language AI copilot that can flexibly handle both visual and natural language inputs, PathChat may potentially find impactful applications in pathology education, research and human-in-the-loop clinical decision-making.
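The architecture sketched in the abstract (a pathology vision encoder coupled to a pretrained large language model and fine-tuned on visual-language instruction data) follows the general pattern of multimodal instruction-tuned assistants. The minimal PyTorch sketch below illustrates that pattern only; the class name MultimodalAssistant, the projector design and all dimensions are illustrative assumptions and do not reflect PathChat's actual implementation, which is described in the article itself.

import torch
import torch.nn as nn

class MultimodalAssistant(nn.Module):
    # Hypothetical sketch: vision encoder -> projector -> language model.
    # Components and dimensions are assumptions for exposition, not PathChat's.
    def __init__(self, vision_encoder: nn.Module, language_model: nn.Module,
                 vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.vision_encoder = vision_encoder      # e.g. a pathology image encoder
        self.projector = nn.Sequential(           # maps image features into the LLM token space
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )
        self.language_model = language_model      # pretrained decoder-only LLM

    def forward(self, images: torch.Tensor, text_embeds: torch.Tensor) -> torch.Tensor:
        vision_feats = self.vision_encoder(images)     # (B, N_patches, vision_dim)
        vision_tokens = self.projector(vision_feats)   # (B, N_patches, llm_dim)
        # Prepend projected image tokens to the text embeddings so the language
        # model attends over one combined multimodal sequence.
        inputs = torch.cat([vision_tokens, text_embeds], dim=1)
        return self.language_model(inputs)

if __name__ == "__main__":
    B, P, T = 2, 4, 8                  # batch, image patches, text tokens
    encoder = nn.Linear(32, 1024)      # toy encoder: (B, P, 32) -> (B, P, 1024)
    llm = nn.Linear(4096, 4096)        # toy stand-in for a decoder-only LLM
    model = MultimodalAssistant(encoder, llm)
    out = model(torch.randn(B, P, 32), torch.randn(B, T, 4096))
    print(out.shape)                   # torch.Size([2, 12, 4096])

During instruction fine-tuning of such a system, the projector (and optionally the encoder and language model) would be updated on paired image and question-and-answer turns, which mirrors, at a high level, the instruction-tuning setup the abstract describes.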

Date: 2024

Downloads: (external link)
https://www.nature.com/articles/s41586-024-07618-3 Abstract (text/html)
Access to the full text of the articles in this series is restricted.



Persistent link: https://EconPapers.repec.org/RePEc:nat:nature:v:634:y:2024:i:8033:d:10.1038_s41586-024-07618-3

Ordering information: This journal article can be ordered from
https://www.nature.com/

DOI: 10.1038/s41586-024-07618-3


Nature is currently edited by Magdalena Skipper

More articles in Nature from Nature
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:nature:v:634:y:2024:i:8033:d:10.1038_s41586-024-07618-3