Machine learning algorithms and their predictive accuracy for suicide and self-harm: Systematic review and meta-analysis

Spittal, Matthew J; Guo, Xianglin Aneta; Kang, Laurant; Kirtley, Olivia J; Clapperton, Angela; Hawton, Keith; Kapur, Nav; Pirkis, Jane; Carter, Greg

PLOS Medicine, 2025, vol. 22, issue 9, 1-23

Abstract:

Background: There has been rapid expansion in the development of machine learning algorithms to predict suicidal behaviours. To test the accuracy of these algorithms for predicting suicide and hospital-treated self-harm, we undertook a systematic review and meta-analysis. The study was registered (PROSPERO CRD42024523074).

Methods and findings: We searched PubMed, PsycINFO, Scopus, EMBASE, IEEE, Medline, CINAHL and Web of Science from database inception until 30 April 2025 to identify studies using machine learning algorithms to predict suicide, self-harm and a combined suicide/self-harm outcome. Studies were included if they examined suicide or hospital-treated self-harm outcomes using a case-control, case-cohort or cohort study design. Studies were excluded if they used self-reported outcomes or examined outcomes using other study designs. Accuracy was assessed using statistical methods appropriate for diagnostic accuracy studies. Fifty-three studies met the inclusion criteria. The area under the receiver operating characteristic curve ranged from 0.69 to 0.93. Sensitivity was 45%–82% and specificity was 91%–95%. Positive likelihood ratios (LR+) were 6.5–9.9 and negative likelihood ratios were 0.2–0.6. Using in-sample prevalence values, the positive predictive values ranged from 6% to 17%. Using out-of-sample prevalence values at an LR+ value of 10, the positive predictive value was 0.1% in low prevalence populations, 17% in medium prevalence populations and 66% in high prevalence populations. The main study limitations were the exclusion of relevant studies where we could not extract sufficient information to calculate accuracy statistics and between-study differences in the follow-up time over which the outcomes were observed.

Conclusions: The accuracy of machine learning algorithms for predicting suicidal behaviour is too low to be useful for screening (case finding) or for prioritising high-risk individuals for interventions (treatment allocation). For hospital-treated self-harm populations, management should instead include three components for all patients: a needs-based assessment and response, identification of modifiable risk factors with treatment intended to reduce those exposures, and implementation of demonstrated effective aftercare interventions.

Author summary: Why was this study done? In a systematic review, Matthew Spittal and colleagues investigate the accuracy of machine learning algorithms to predict suicide and self-harm. They find the predictive properties of these machine learning algorithms to be poor, and no better than traditional risk assessment scales.
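
The dependence of positive predictive value on prevalence reported in the abstract can be illustrated with the standard odds form of Bayes' theorem (post-test odds = LR+ × pre-test odds). The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the prevalence values it prints are back-calculated from the positive predictive values quoted in the abstract at LR+ = 10, since the abstract does not state the out-of-sample prevalence values themselves.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as proportions."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def ppv_from_lr(lr_pos, prevalence):
    """Positive predictive value from LR+ and prevalence (odds form of Bayes' theorem)."""
    if not (0.0 < prevalence < 1.0):
        raise ValueError("prevalence must lie strictly between 0 and 1")
    pre_test_odds = prevalence / (1.0 - prevalence)
    post_test_odds = lr_pos * pre_test_odds
    return post_test_odds / (1.0 + post_test_odds)


def prevalence_from_ppv(lr_pos, ppv):
    """Inverse of ppv_from_lr: the prevalence implied by a given PPV and LR+."""
    if not (0.0 < ppv < 1.0):
        raise ValueError("ppv must lie strictly between 0 and 1")
    post_test_odds = ppv / (1.0 - ppv)
    pre_test_odds = post_test_odds / lr_pos
    return pre_test_odds / (1.0 + pre_test_odds)


if __name__ == "__main__":
    # PPVs quoted in the abstract at LR+ = 10; the implied prevalences are
    # back-calculated here and are an assumption, not figures from the paper.
    quoted_ppv = {"low": 0.001, "medium": 0.17, "high": 0.66}
    for label, ppv in quoted_ppv.items():
        implied = prevalence_from_ppv(10.0, ppv)
        print(f"{label}: PPV {ppv:.1%} implies prevalence of about {implied:.2%}")
```

The point of the sketch is that, for a fixed LR+, the positive predictive value is driven almost entirely by prevalence. This is why a test with a seemingly strong likelihood ratio still produces mostly false positives when the outcome is rare.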

Date: 2025
Downloads: (external link)
https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1004581 (text/html)
https://journals.plos.org/plosmedicine/article/fil ... 04581&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pmed00:1004581

DOI: 10.1371/journal.pmed.1004581
