Classification of Particulate Matter (PM2.5) Concentrations Using Feature Selection and Machine Learning Strategies
Matara Caroline Mongina,
Nyambane Simpson Osano,
Yusuf Amir Okeyo,
Ochungo Elisha Akech and
Khattak Afaq
Additional contact information
Matara Caroline Mongina: University of Nairobi, Department of Civil & Construction Engineering, P.O. Box 30197-00100, Nairobi, Kenya
Nyambane Simpson Osano: University of Nairobi, Department of Civil & Construction Engineering, P.O. Box 30197-00100, Nairobi, Kenya
Yusuf Amir Okeyo: University of Nairobi, Department of Chemistry, P.O. Box 30197-00100, Nairobi, Kenya
Ochungo Elisha Akech: Multimedia University, Department of Civil, Faculty of Engineering and Technology (FoET), P.O. Box 15653-00503, Nairobi, Kenya
Khattak Afaq: Tongji University, College of Transportation Engineering, 4800 Cao’an Highway, Jiading District, Shanghai 201804, China
LOGI – Scientific Journal on Transport and Logistics, 2024, vol. 15, issue 1, 85-96
Abstract:
This research used machine learning to classify particulate matter (PM2.5) concentrations as acceptable or unacceptable, using a dataset collected along the Nairobi Expressway road corridor. The dataset comprised air quality, traffic volume, and meteorological data. The Boruta Algorithm (BA) was used in conjunction with a Random Forest (RF) classifier to select the most relevant features from the dataset. The BA analysis indicated that humidity was the most influential factor in determining air quality, closely followed by ‘day_of_week’ and the volume of Nairobi-bound traffic. Site temperature was found to be less significant. A comparison of machine learning classifiers for separating acceptable from unacceptable PM2.5 concentrations showed that the Extreme Gradient Boosting (XGBoost) classifier performed best, with Sensitivity (0.774), Specificity (0.943), F1-Score (0.833), and AU-ROC (0.874). The Binary Logistic Regression (BLR) model performed worse than the other ML models, with Sensitivity (0.244), Specificity (0.614), F1-Score (0.455), and AU-ROC (0.508). Predicting PM2.5 can give transport policymakers valuable insights when formulating urban transport policy.
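For readers unfamiliar with the four evaluation metrics named in the abstract, the following is a minimal sketch of how Sensitivity, Specificity, F1-Score, and AU-ROC can be computed for a binary classifier. It is not the authors' code. The function name `evaluate`, the 0.5 decision threshold, the choice of label 1 as the positive class, and the toy labels and scores are all assumptions made for illustration; none of them come from the paper.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    sensitivity: float
    specificity: float
    f1: float
    au_roc: float


def evaluate(y_true, y_score, threshold=0.5):
    """Compute four binary-classification metrics (hypothetical helper).

    y_true  : list of 0/1 labels, where 1 is the assumed positive class
    y_score : list of predicted scores for the positive class
    """
    # Turn scores into hard predictions at the assumed threshold.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true negative rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0

    # AU-ROC as the probability that a random positive example receives
    # a higher score than a random negative one (ties count as 0.5).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum(
            1.0 if p > n else 0.5 if p == n else 0.0
            for p in pos
            for n in neg
        )
        au_roc = wins / (len(pos) * len(neg))
    else:
        au_roc = float("nan")  # undefined when one class is absent

    return BinaryMetrics(sensitivity, specificity, f1, au_roc)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
    print(evaluate(labels, scores))
```

The pairwise AU-ROC calculation is quadratic in the number of samples, which is fine for a small illustration; a library routine would normally be used on a full dataset.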
Keywords: Air quality; traffic pollution; particulate matter; Boruta algorithm; machine learning
Date: 2024
Downloads: https://doi.org/10.2478/logi-2024-0008 (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:vrs:logitl:v:15:y:2024:i:1:p:85-96:n:1008
DOI: 10.2478/logi-2024-0008
LOGI – Scientific Journal on Transport and Logistics is currently edited by Rudolf Kampf
More articles in LOGI – Scientific Journal on Transport and Logistics from Sciendo
Bibliographic data for series maintained by Peter Golla.