Artificial intelligence-aided clinical annotation of a large multi-cancer genomic dataset

Kenneth L. Kehl, Wenxin Xu, Alexander Gusev, Ziad Bakouny, Toni K. Choueiri, Irbaz Bin Riaz, Haitham Elmarakeby, Eliezer M. Allen and Deborah Schrag
Additional contact information
Kenneth L. Kehl: Dana-Farber Cancer Institute
Wenxin Xu: Dana-Farber Cancer Institute
Alexander Gusev: Dana-Farber Cancer Institute
Ziad Bakouny: Dana-Farber Cancer Institute
Toni K. Choueiri: Dana-Farber Cancer Institute
Irbaz Bin Riaz: Mayo Clinic
Haitham Elmarakeby: Dana-Farber Cancer Institute
Eliezer M. Allen: Dana-Farber Cancer Institute
Deborah Schrag: Memorial Sloan Kettering Cancer Center

Nature Communications, 2021, vol. 12, issue 1, 1-9

Abstract: To accelerate cancer research that correlates biomarkers with clinical endpoints, methods are needed to ascertain outcomes from electronic health records at scale. Here, we train deep natural language processing (NLP) models to extract outcomes for participants with any of 7 solid tumors in a precision oncology study. Outcomes are extracted from 305,151 imaging reports for 13,130 patients and 233,517 oncologist notes for 13,511 patients, including patients with 6 additional cancer types. NLP models recapitulate outcome annotation from these documents, including the presence of cancer, progression/worsening, response/improvement, and metastases, with excellent discrimination (AUROC > 0.90). Models generalize to cancers excluded from training and yield outcomes correlated with survival. Among patients receiving checkpoint inhibitors, we confirm that high tumor mutation burden is associated with superior progression-free survival ascertained using NLP. Here, we show that deep NLP can accelerate annotation of molecular cancer datasets with clinically meaningful endpoints to facilitate discovery.

Date: 2021

Downloads: (external link)
https://www.nature.com/articles/s41467-021-27358-6 Abstract (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:12:y:2021:i:1:d:10.1038_s41467-021-27358-6

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-021-27358-6

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:natcom:v:12:y:2021:i:1:d:10.1038_s41467-021-27358-6