Accurate auto-labeling of chest X-ray images based on quantitative similarity to an explainable AI model
Doyun Kim,
Joowon Chung,
Jongmun Choi,
Marc D. Succi,
John Conklin,
Maria Gabriela Figueiro Longo,
Jeanne B. Ackman,
Brent P. Little,
Milena Petranovic,
Mannudeep K. Kalra,
Michael H. Lev and
Synho Do
Additional contact information
Doyun Kim: Massachusetts General Brigham and Harvard Medical School
Joowon Chung: Massachusetts General Brigham and Harvard Medical School
Jongmun Choi: Massachusetts General Brigham and Harvard Medical School
Marc D. Succi: Massachusetts General Brigham and Harvard Medical School
John Conklin: Massachusetts General Brigham and Harvard Medical School
Maria Gabriela Figueiro Longo: Massachusetts General Brigham and Harvard Medical School
Jeanne B. Ackman: Massachusetts General Brigham and Harvard Medical School
Brent P. Little: Massachusetts General Brigham and Harvard Medical School
Milena Petranovic: Massachusetts General Brigham and Harvard Medical School
Mannudeep K. Kalra: Massachusetts General Brigham and Harvard Medical School
Michael H. Lev: Massachusetts General Brigham and Harvard Medical School
Synho Do: Massachusetts General Brigham and Harvard Medical School
Nature Communications, 2022, vol. 13, issue 1, 1-15
Abstract:
The inability to accurately and efficiently label large, open-access medical imaging datasets limits the widespread implementation of artificial intelligence models in healthcare. There have been few attempts to automate the annotation of such public databases; one approach, for example, relied on labor-intensive, manual labeling of subsets of these datasets to train new models. In this study, we describe a method for standardized, automated labeling based on similarity to a previously validated, explainable AI (xAI) model-derived atlas, for which the user can specify a quantitative threshold for a desired level of accuracy (the probability-of-similarity, or pSim, metric). We show that our xAI model, by calculating pSim values for each clinical output label through comparison with its training-set-derived reference atlas, can automatically label external datasets to a user-selected, high level of accuracy, equaling or exceeding that of human experts. We additionally show that, by fine-tuning the original model using the automatically labeled exams for retraining, performance can be preserved or improved, resulting in a highly accurate, more generalized model.
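The thresholding idea described in the abstract can be sketched in code. This is a minimal, hypothetical illustration, not the authors' implementation: it assumes each label in the atlas is represented by a single reference embedding, uses cosine similarity rescaled to [0, 1] as a stand-in for the pSim metric, and keeps only labels whose pSim meets the user-selected threshold. The function and variable names (`auto_label`, `atlas`, `exam_embedding`) are invented for this sketch.

```python
import numpy as np

def auto_label(exam_embedding, atlas, threshold=0.9):
    """Assign each label whose probability-of-similarity (pSim) to the
    atlas reference exceeds a user-chosen accuracy threshold.

    atlas: dict mapping label name -> reference embedding (hypothetical
    structure; one vector per label, derived from the training set).
    Returns a dict of accepted labels and their pSim values.
    """
    labels = {}
    for label, ref in atlas.items():
        # Cosine similarity between the exam and the label's atlas entry.
        sim = float(np.dot(exam_embedding, ref) /
                    (np.linalg.norm(exam_embedding) * np.linalg.norm(ref)))
        # Rescale [-1, 1] similarity to a [0, 1] pseudo-probability (pSim stand-in).
        psim = (sim + 1.0) / 2.0
        if psim >= threshold:
            labels[label] = psim
    return labels

# Usage: an exam close to the "cardiomegaly" reference is auto-labeled,
# while a dissimilar label is rejected at the chosen threshold.
atlas = {"cardiomegaly": np.array([1.0, 0.0]),
         "effusion": np.array([0.0, 1.0])}
exam = np.array([1.0, 0.1])
print(auto_label(exam, atlas, threshold=0.9))
```

Raising `threshold` trades recall for precision: fewer exams are labeled, but those that are labeled match the atlas more closely, which is the lever the abstract describes for hitting a desired accuracy level.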
Date: 2022
Full text: https://www.nature.com/articles/s41467-022-29437-8 (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-29437-8
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-022-29437-8
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for this series is maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.