Safety-oriented and explainable machine learning for KSI crash risk prediction: Evidence from the United Kingdom
Khanh Giang Le
PLOS ONE, 2026, vol. 21, issue 4, 1-21
Abstract:
Road traffic crashes pose a serious public safety challenge, particularly due to fatal and serious injuries. Although machine learning (ML) has been widely used for crash severity prediction, many studies remain accuracy-oriented and insufficiently address class imbalance, decision thresholds, and probabilistic reliability. This study proposes a safety-oriented and explainable ML framework for predicting killed or seriously injured (KSI) crashes using nationwide United Kingdom traffic accident data from 2020–2024. Crash severity is reformulated as a binary classification task distinguishing slight injury crashes from KSI outcomes, aligning model objectives with road safety priorities. A Light Gradient Boosting Machine (LightGBM) model is developed with imbalance handling using SMOTE, safety-oriented decision threshold optimization, and probability calibration. Model performance is evaluated using ROC–AUC, precision–recall analysis, confusion matrices, the Brier score, and a utility-based evaluation metric, while interpretability is ensured through SHapley Additive exPlanations (SHAP). Results show that default threshold settings fail to adequately detect severe crashes. At an optimized threshold of 0.35, the model achieves a Recall(KSI) of 0.605 – representing a substantial 73% improvement compared to conventional configurations – while maintaining acceptable precision. In addition, probability calibration confirms reliable risk estimation (Brier score = 0.190), supporting risk-based interpretation. Comparative analysis demonstrates that the SMOTE-based model provides a more balanced and operationally effective trade-off than class-weighted learning. SHAP analysis identifies speed limit, road class, lighting conditions, and urban context as key variables associated with KSI risk. The findings highlight the importance of safety-oriented learning design and context-aware performance interpretation for effective, risk-based traffic safety management.
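The abstract's point about decision thresholds and probabilistic reliability can be illustrated with a minimal sketch. This is not the paper's code: the labels and probabilities below are hypothetical toy values, and the helper names (`recall_precision`, `brier_score`) are assumptions introduced only for illustration. It shows how lowering the threshold from the conventional 0.5 to 0.35 trades precision for recall on the positive (KSI) class, and how the Brier score is computed as the mean squared error between predicted probability and observed outcome.

```python
from typing import Sequence, Tuple


def recall_precision(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (recall, precision) for the positive class at a given threshold."""
    tp = fp = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 1:
            fn += 1
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy data: 1 = KSI crash, 0 = slight injury crash.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.80, 0.45, 0.40, 0.20, 0.60, 0.38, 0.30, 0.10, 0.05, 0.15]

for t in (0.50, 0.35):
    r, p = recall_precision(y_true, y_prob, t)
    print(f"threshold={t:.2f}  recall={r:.3f}  precision={p:.3f}")

print(f"Brier score={brier_score(y_true, y_prob):.3f}")
```

On this toy data, the lower threshold catches more of the positive cases at the cost of extra false alarms, which is the kind of trade-off the abstract describes; the actual figures reported in the paper come from its own model and data, not from this sketch.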
Date: 2026
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0347873 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 47873&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0347873
DOI: 10.1371/journal.pone.0347873