Predicting suicide attempt or suicide death following a visit to psychiatric specialty care: A machine learning study using Swedish national registry data

Chen, Qi; Zhang-James, Yanli; Barnett, Eric J; Lichtenstein, Paul; Jokinen, Jussi; D’Onofrio, Brian M; Faraone, Stephen V; Larsson, Henrik; Fazel, Seena

PLOS Medicine, 2020, vol. 17, issue 11, 1-19

Abstract: Background: Suicide is a major public health concern globally. Accurately predicting suicidal behavior remains challenging. This study aimed to use machine learning approaches to examine the potential of the Swedish national registry data for prediction of suicidal behavior.

Methods and findings: The study sample consisted of 541,300 inpatient and outpatient visits to psychiatric specialty care in Sweden between January 1, 2011 and December 31, 2012, made by 126,205 Sweden-born patients (54% female and 46% male) aged 18 to 39 years (mean age at the visit: 27.3 years). The most common psychiatric diagnoses at the visit were anxiety disorders (20.0%), major depressive disorder (16.9%), and substance use disorders (13.6%). A total of 425 candidate predictors covering demographic characteristics, socioeconomic status (SES), electronic medical records, criminality, and family history of disease and crime were extracted from the Swedish registry data. The sample was randomly split into an 80% training set containing 433,024 visits and a 20% test set containing 108,276 visits. Models were trained separately for suicide attempt/death within 90 days and within 30 days following a visit, using multiple machine learning algorithms. Both model discrimination and calibration were evaluated.

Among all eligible visits, 3.5% (18,682) were followed by a suicide attempt/death within 90 days and 1.7% (9,099) within 30 days. The final models were based on ensemble learning that combined predictions from elastic net penalized logistic regression, random forest, gradient boosting, and a neural network. The areas under the receiver operating characteristic (ROC) curve (AUCs) on the test set were 0.88 (95% confidence interval [CI] = 0.87–0.89) for the outcome within 90 days and 0.89 (95% CI = 0.88–0.90) for the outcome within 30 days, both significantly better than chance (i.e., AUC = 0.50) (p
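The discrimination measure reported in the abstract can be illustrated with a minimal sketch. The abstract does not state how the four models' predictions were combined, so the unweighted averaging below is an assumption, not the authors' method. The function names (`auc`, `ensemble_average`) and all scores are hypothetical toy values, not study data. The AUC is computed as the share of (positive, negative) pairs in which the positive case receives the higher score, so a score that carries no information gives 0.50.

```python
from statistics import mean


def auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def ensemble_average(per_model_scores):
    """Assumed combination rule: unweighted mean of each visit's predicted
    risk across models (the abstract does not specify the actual rule)."""
    return [mean(column) for column in zip(*per_model_scores)]


# Toy values invented for illustration; 1 = outcome occurred after the visit.
labels = [0, 0, 1, 0, 1, 0]
model_a = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05]  # hypothetical model scores
model_b = [0.20, 0.10, 0.60, 0.65, 0.70, 0.15]  # hypothetical model scores
combined = ensemble_average([model_a, model_b])

print(auc(labels, model_a), auc(labels, model_b), auc(labels, combined))
```

In this toy case each model misranks one pair (AUC 0.875 each), while the averaged score ranks every positive above every negative (AUC 1.0). This shows only why combining models can help; it says nothing about the size of any gain in the study itself.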

Date: 2020

Downloads: (external link)
https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1003416 (text/html)
https://journals.plos.org/plosmedicine/article?id= ... 03416&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pmed00:1003416

DOI: 10.1371/journal.pmed.1003416
