Predicting crash occurrence at intersections in Texas: an opportunity for machine learning
Theodore Charm, Haoqi Wang, Natalia Zuniga-Garcia, Mostaq Ahmed and Kara M. Kockelman
Transportation Planning and Technology, 2024, vol. 47, issue 8, 1184-1204
Abstract:
This paper studies the frequency of traffic crashes at intersections across Texas using Zero-inflated Negative Binomial (ZINB) and Negative Binomial-Lindley (NB-L) generalized linear models, as well as several tree-based machine learning (ML) methods: Random Forests (RF), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Bayesian Additive Regression Trees (BART). Official crash reports from 2010 through 2019 were linked to Texas' more than 700,000 intersections. RF provided the best prediction performance (measured by R-squared and Root Mean Square Error) while also handling highly imbalanced crash data (with many zero-crash cases) well. Sensitivity analysis highlights the practical significance of signalized intersections, annual average daily traffic, the number of lanes at the intersection approach, and other covariates.
Date: 2024
Downloads: (external link)
http://hdl.handle.net/10.1080/03081060.2023.2177651 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:taf:transp:v:47:y:2024:i:8:p:1184-1204
Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/GTPT20
DOI: 10.1080/03081060.2023.2177651
Transportation Planning and Technology is currently edited by Dr. David Gillingwater
Bibliographic data for series maintained by Chris Longhurst.