Crash Severity Analysis of Highways Based on Multinomial Logistic Regression Model, Decision Tree Techniques, and Artificial Neural Network: A Modeling Comparison
Gholamreza Shiran,
Reza Imaninasab and
Razieh Khayamim
Additional contact information
Gholamreza Shiran: Faculty of Civil Engineering and Transportation, University of Isfahan, Isfahan 8174673441, Iran
Reza Imaninasab: Lyles School of Civil Engineering, Purdue University, West Lafayette, IN 47907, USA
Razieh Khayamim: Department of Transportation Engineering, Isfahan University of Technology, Isfahan 8415683111, Iran
Sustainability, 2021, vol. 13, issue 10, 1-23
Abstract:
The classification of vehicular crashes by severity is crucial because not all crashes carry the same financial and injury costs. In addition, accurate prediction modeling makes it possible to avoid crashes by identifying their influential factors. Crash severity analysis therefore requires accurate and time-saving prediction models for classifying crashes by severity. Moreover, statistical models are incapable of identifying the potential severity of crashes with respect to the influencing factors incorporated in the models. Unlike previous research efforts, which applied data mining models to a limited set of crash severity classes (property damage only (PDO), fatality, and injury), the present study sought to predict crash frequency according to five severity levels: PDO, fatality, severe injury, other visible injuries, and complaint of pain. The multinomial logistic regression (MLR) model and data mining approaches, including the artificial neural network-multilayer perceptron (ANN-MLP) and two decision tree techniques, namely the Chi-square automatic interaction detector (CHAID) and C5.0, were applied to traffic crash records for State Highways in California, USA. A comparison of the relative importance of the ten qualitative and ten quantitative independent variables incorporated in CHAID and C5.0 indicated that the cause of the crash (X1) and the number of vehicles (X5) were the most influential variables. However, the ANN-MLP model identified the cause of the crash (X1) and weather (X2) as the most contributing variables. In addition, the MLR model showed that the driver’s age (X11) accounts for a larger proportion of traffic crash severity. The sensitivity analysis demonstrated that C5.0 had the best performance for predicting road crash severity. Not only did C5.0 take less time (0.05 s) than CHAID, MLP, and MLR, it also achieved the highest accuracy rate for the training set: its overall prediction accuracy on the training data was approximately 88.09%, compared with 77.21% and 70.21% for the CHAID and MLP models, respectively. In general, the findings of this study revealed that C5.0 can be a promising tool for predicting road crash severity.
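As a rough illustration of the chi-square test that gives CHAID its name, the sketch below scores two categorical predictors against crash severity and picks the one with the larger Pearson chi-square statistic as the split candidate. It is a simplification under stated assumptions, not the authors' implementation: the records, the field names (`cause`, `weather`, `severity`), and the helper `chi_square` are hypothetical, only two severity classes are used for brevity, and full CHAID also merges categories and compares Bonferroni-adjusted p-values rather than raw statistics.

```python
from collections import Counter

# Hypothetical toy records; NOT data from the study.
records = [
    {"cause": "speeding", "weather": "clear", "severity": "injury"},
    {"cause": "speeding", "weather": "rain", "severity": "injury"},
    {"cause": "speeding", "weather": "clear", "severity": "injury"},
    {"cause": "speeding", "weather": "rain", "severity": "PDO"},
    {"cause": "other", "weather": "clear", "severity": "PDO"},
    {"cause": "other", "weather": "rain", "severity": "PDO"},
    {"cause": "other", "weather": "clear", "severity": "PDO"},
    {"cause": "other", "weather": "rain", "severity": "injury"},
]


def chi_square(rows, predictor, target="severity"):
    """Pearson chi-square statistic of the predictor-by-target contingency table."""
    n = len(rows)
    row_totals = Counter(r[predictor] for r in rows)
    col_totals = Counter(r[target] for r in rows)
    observed = Counter((r[predictor], r[target]) for r in rows)
    stat = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            # Expected count if predictor and severity were independent.
            expected = row_n * col_n / n
            stat += (observed[(row, col)] - expected) ** 2 / expected
    return stat


# Score each candidate predictor and keep the strongest association.
scores = {p: chi_square(records, p) for p in ("cause", "weather")}
best = max(scores, key=scores.get)
print(scores, best)
```

On this toy data the statistic is 2.0 for `cause` and 0.0 for `weather`, so `cause` would be chosen as the first split, which merely mirrors the kind of variable ranking the abstract reports and says nothing about the study's actual values.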
Keywords: crash severity; multinomial logistic regression model; decision tree techniques; artificial neural network
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2021
Downloads:
https://www.mdpi.com/2071-1050/13/10/5670/pdf (application/pdf)
https://www.mdpi.com/2071-1050/13/10/5670/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:13:y:2021:i:10:p:5670-:d:557267
Sustainability is currently edited by Ms. Alexandra Wu
Bibliographic data for series maintained by MDPI Indexing Manager.