Performance of predictive AI-based clinical decision support systems across clinical domains: A systematic review and meta-analysis

Waldock, William J; Guni, Ahmad; Darzi, Ara; Ashrafian, Hutan

PLOS Digital Health, 2026, vol. 5, issue 3, 1-31

Abstract: Despite advances in deep learning and transformer architectures, prior reviews have focused narrowly on traditional clinical decision support systems (CDSS) or single medical domains, leaving significant gaps in understanding contemporary AI-driven predictive tools. This systematic review and meta-analysis evaluated the predictive performance of artificial intelligence-based CDSS (AI-CDSS) across multiple medical specialties. Following PRISMA guidelines, PubMed and Cochrane Library were searched through December 2024 for studies evaluating predictive AI-CDSS using real-world clinical data. Two reviewers independently screened 3,296 records (κ = 0.833), with study quality assessed via QUADAS-2 and performance measures pooled using random-effects meta-analysis. Fifty studies spanning 17 medical specialties were included. Meta-analysis demonstrated moderate discriminatory ability (pooled AUC: 0.652, 95% CI: 0.562–0.743), high specificity (0.819, 95% CI: 0.793–0.844), moderate accuracy (0.765, 95% CI: 0.734–0.796), and variable sensitivity (0.660, 95% CI: 0.535–0.785), with substantial heterogeneity across all measures (I² ≥ 98.9%). Only 24% of studies involved prospective deployment, and 64% reported exclusively technical metrics without clinical workflow data. Predictive AI-CDSS demonstrate moderate-to-good diagnostic performance with strong specificity; however, the predominance of retrospective study designs and limited implementation reporting reveal critical gaps between technical validation and real-world clinical utility. To address these shortcomings, we propose the ROADMAP framework, structured around seven domains: Representative development, Outcomes-focused evaluation, Assessment for deployment, Data harmonization, Monitoring for bias, Allocation via economic evaluations, and Priorities for standardized reporting and prospective validation. 
This framework provides a practical roadmap for bridging the gap between algorithmic performance and meaningful clinical integration.

Author summary: In our study, we set out to understand how well modern Artificial Intelligence (AI) assists doctors in making clinical decisions across a wide range of medical specialties. While AI technology has advanced rapidly, we realized that previous research was often too narrow or outdated to show the full picture of these modern predictive tools.
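
Illustrative note: the abstract reports pooled estimates from a random-effects meta-analysis together with an I² heterogeneity statistic, but it does not say which estimator was used. The sketch below assumes the common DerSimonian-Laird approach and a normal-approximation 95% interval. The function name `pool_random_effects` and its inputs are hypothetical and are not taken from the article.

```python
import math


def pool_random_effects(estimates, std_errors):
    """Pool per-study estimates with a DerSimonian-Laird random-effects model.

    Assumption: each study supplies a point estimate (e.g. an AUC) and its
    standard error. This is a generic sketch, not the article's own code.
    """
    if len(estimates) != len(std_errors) or len(estimates) < 2:
        raise ValueError("need at least two studies with matching inputs")

    variances = [se ** 2 for se in std_errors]

    # Fixed-effect (inverse-variance) weights and pooled mean.
    w_fixed = [1.0 / v for v in variances]
    sum_w = sum(w_fixed)
    fixed_mean = sum(w * y for w, y in zip(w_fixed, estimates)) / sum_w

    # Cochran's Q measures dispersion of studies around the fixed-effect mean.
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(w_fixed, estimates))
    df = len(estimates) - 1

    # Between-study variance tau^2 (method of moments, truncated at zero).
    c = sum_w - sum(w ** 2 for w in w_fixed) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_random = [1.0 / (v + tau2) for v in variances]
    sum_wr = sum(w_random)
    pooled = sum(w * y for w, y in zip(w_random, estimates)) / sum_wr
    pooled_se = math.sqrt(1.0 / sum_wr)

    # I^2: share of total variation attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * pooled_se,
        "ci_high": pooled + 1.96 * pooled_se,
        "tau2": tau2,
        "i2_percent": i2,
    }


if __name__ == "__main__":
    # Made-up inputs for illustration only; these are not data from the review.
    result = pool_random_effects([0.55, 0.70, 0.82], [0.02, 0.03, 0.02])
    print(result)
```

When the studies disagree by much more than their standard errors would suggest, Q greatly exceeds its degrees of freedom and I² approaches 100%, which is the situation the abstract describes with I² ≥ 98.9%.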

Date: 2026

Downloads: (external link)
https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0001310 (text/html)
https://journals.plos.org/digitalhealth/article/fi ... 01310&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pdig00:0001310

DOI: 10.1371/journal.pdig.0001310

More articles in PLOS Digital Health from Public Library of Science