Predictive divergence in machine learning models for clinical mortality risk: A multicohort study of covid-19 patients

Magalhães, Júlia Chaves Neuenschwander; Filho, Alexandre Dias Porto Chiavegatto

PLOS ONE, 2026, vol. 21, issue 3, 1-14

Abstract:

Background: Machine learning (ML) algorithms are increasingly used in healthcare to support clinical decision-making. While models with similar overall performance are often considered interchangeable for deployment, they may produce divergent predictions, a phenomenon known as algorithmic multiplicity. In such cases, the choice of algorithm may introduce bias. This study investigates the impacts of algorithmic multiplicity in mortality prediction and assesses the influence of patient characteristics on model decisions.

Methods: A cohort of 4,337 adult patients (≥18 years) with RT-PCR–confirmed covid-19 from five tertiary care hospitals in Brazil was followed from March to August 2020. Five popular ML models for structured data were trained on demographic and laboratory data collected at early hospital admission to predict in-hospital mortality. Model performance, feature importance, and algorithmic prediction similarity were evaluated. Feature distributions were compared between patients correctly or incorrectly classified by all models using paired t-tests or Mann–Whitney U tests, as applicable, at the 5% significance level. Subgroup performance differences were assessed using 10-fold cross-validation applied to five k-means–delineated clusters, compared by one-way ANOVA. Within-cluster predictive divergence was assessed within a 95% confidence interval.

Results: All models achieved high overall predictive performance (µ = 0.855, σ² = 0.0072). However, the comparison of individual-level predictions revealed substantial heterogeneity, with pairwise prediction correlations ranging from R² = 0.56 to 0.80. Unsupervised k-means clustering identified five clinically distinct patient subgroups with mortality rates ranging from 22% to 80%, within which model performance varied significantly (F = 73.18, p
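
As an illustration of the "algorithmic prediction similarity" comparison described in the abstract, the following is a minimal sketch of how pairwise agreement between models could be computed. It assumes that R² denotes the squared Pearson correlation between two models' predicted risks for the same patients, which the abstract does not state explicitly. The model names and risk values are hypothetical placeholders, not data from the study.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def pairwise_r2(predictions):
    """Map each pair of model names to the squared correlation of their risks."""
    return {
        (m1, m2): pearson_r(predictions[m1], predictions[m2]) ** 2
        for m1, m2 in combinations(sorted(predictions), 2)
    }


# Hypothetical predicted mortality risks for the same six patients
# (illustrative values only, not taken from the paper).
predictions = {
    "model_a": [0.10, 0.20, 0.35, 0.60, 0.80, 0.90],
    "model_b": [0.15, 0.10, 0.50, 0.40, 0.85, 0.70],
    "model_c": [0.05, 0.30, 0.30, 0.70, 0.75, 0.95],
}

for pair, r2 in pairwise_r2(predictions).items():
    print(pair, round(r2, 2))
```

Two models can have similar aggregate performance while their pairwise R² is well below 1, which is the kind of individual-level divergence the abstract reports.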

Date: 2026

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0344354 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 44354&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0344354

DOI: 10.1371/journal.pone.0344354
