From manual clinical criteria to machine learning algorithms: Comparing outcome endpoints derived from diverse electronic health record data modalities

Chappidi, Shreya; Belue, Mason J; Harmon, Stephanie A; Jagasia, Sarisha; Zhuge, Ying; Tasci, Erdal; Turkbey, Baris; Singh, Jatinder; Camphausen, Kevin; Krauze, Andra V

PLOS Digital Health, 2025, vol. 4, issue 5, 1-29

Abstract:

Background: Progression-free survival (PFS) is a critical clinical outcome endpoint during cancer management and treatment evaluation. Yet PFS is often missing from publicly available datasets because generating PFS metrics is currently subjective, expert-dependent, and time-intensive. Given emerging research in multi-modal machine learning (ML), we explored the benefits and challenges associated with mining different electronic health record (EHR) data modalities and automating extraction of PFS metrics via ML algorithms.

Methods: We analyzed EHR data from 92 patients with pathology-proven glioblastoma (GBM), obtaining 233 corticosteroid prescriptions, 2080 radiology reports, and 743 brain MRI scans. Three methods were developed to derive clinical PFS: 1) frequency analysis of corticosteroid prescriptions, 2) natural language processing (NLP) of reports, and 3) computer vision (CV) volumetric analysis of imaging. Outputs from these methods were compared to manually annotated clinical guideline PFS metrics.

Results: Employing data-driven methods, standalone progression rates were 63% (prescription), 78% (NLP), and 54% (CV), compared to the 99% progression rate from manually applied clinical guidelines using integrated data sources. The prescription method identified progression an average of 5.2 months later than the clinical standard, while the CV and NLP algorithms identified progression earlier by 2.6 and 6.9 months, respectively. While lesion growth is a clinical guideline progression indicator, only half of patients exhibited increasing contrast-enhancing tumor volumes during scan-based CV analysis.

Conclusion: Our results indicate that data-driven algorithms can extract tumor progression outcomes from existing EHR data. However, ML methods are subject to varying availability bias, supporting contextual information, and pre-processing resource burdens that influence the extracted PFS endpoint distributions. Our scan-based CV results also suggest that the automation of clinical criteria may not align with human intuition. Our findings indicate a need for improved data source integration, validation, and revisiting of clinical criteria in parallel to multi-modal ML algorithm development.

Author summary: Progression-free survival is an important outcome in cancer research used to evaluate new treatments. However, these data are often not publicly available because they require labor-intensive, subjective judgement from clinicians. Different data modalities stored in the electronic health record, such as text reports and imaging, could be used to automate the extraction of progression events from a patient’s medical record. This paper explores three automated and/or machine learning (ML) methods to extract progression from integrated electronic health data: 1) analysis of patient prescription frequencies, 2) natural language processing algorithms applied to radiology reports, and 3) computer vision tumor segmentation algorithms applied to brain MRI scans. These automated results were compared to the current manual clinical standard method of determining progression. Our study found that various ML algorithms can automate the extraction of progression outcomes from diverse patient data. Yet manual evaluation identified progression at a higher rate than data-driven algorithms. Our results indicated that “ground truth” labels obtained for training ML algorithms are influenced by both the data source and the method used to obtain them. Future research should consider that varying data sources, availability, and reliability can create methodological bias during ML projects.
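
Illustrative sketch: the Results compare each method to the manual reference on two quantities, the share of patients with a detected progression event and the average timing offset in months. The Python below shows one way such a comparison could be computed. It is not the authors' code: the patient IDs, dates, function names, and the choice to average offsets only over patients where both sources found progression are all assumptions made for illustration.

```python
from datetime import date
from statistics import mean

# Hypothetical toy data: patient id -> progression date (None = none detected).
manual = {"p1": date(2020, 6, 1), "p2": date(2021, 1, 15), "p3": date(2019, 11, 1)}
method = {"p1": date(2020, 3, 1), "p2": None, "p3": date(2019, 12, 1)}

DAYS_PER_MONTH = 30.44  # average month length; an assumed conversion factor


def progression_rate(dates):
    """Fraction of patients with a detected progression event."""
    return sum(d is not None for d in dates.values()) / len(dates)


def mean_offset_months(method_dates, reference_dates):
    """Mean (method - reference) in months, over patients where both
    sources report progression. Negative means the method was earlier."""
    offsets = [
        (method_dates[p] - ref).days / DAYS_PER_MONTH
        for p, ref in reference_dates.items()
        if ref is not None and method_dates.get(p) is not None
    ]
    return mean(offsets) if offsets else None


rate_manual = progression_rate(manual)
rate_method = progression_rate(method)
offset = mean_offset_months(method, manual)
print(f"manual rate={rate_manual:.2f}, method rate={rate_method:.2f}, "
      f"mean offset={offset:.2f} months")
```

Restricting the offset to patients flagged by both sources is one possible convention; a different handling of patients missed by a method would change the reported averages.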

Date: 2025

Downloads: (external link)
https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000755 (text/html)
https://journals.plos.org/digitalhealth/article/fi ... 00755&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pdig00:0000755

DOI: 10.1371/journal.pdig.0000755
