Short-Term PM 2.5 Concentration Changes Prediction: A Comparison of Meteorological and Historical Data

Kang, Junfeng; Zou, Xinyi; Tan, Jianlin; Li, Jun; Karimian, Hamed

Junfeng Kang, Xinyi Zou, Jianlin Tan, Jun Li and Hamed Karimian
Additional contact information
Junfeng Kang: School of Civil and Surveying & Mapping Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China
Xinyi Zou: School of Civil and Surveying & Mapping Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China
Jianlin Tan: School of Civil and Surveying & Mapping Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China
Jun Li: Guangdong Science & Technology Infrastructure Center, Guangzhou 510033, China
Hamed Karimian: School of Civil and Surveying & Mapping Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China

Sustainability, 2023, vol. 15, issue 14, 1-24

Abstract: Machine learning is widely used to predict PM 2.5 concentrations. This study compares the prediction accuracy of machine learning models for short-term PM 2.5 concentration changes and seeks a universal and robust model for both hourly and daily time scales. Five commonly used machine learning models were constructed, along with a stacking model that uses Multivariable Linear Regression (MLR) as the meta-learner and Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) as the base learners. Two kinds of datasets, meteorological data alone and historical PM 2.5 concentration data combined with meteorological data, were preprocessed and used to evaluate the models' accuracy and stability at hourly and daily time scales, using the coefficient of determination (R²), Root-Mean-Square Error (RMSE), and Mean Absolute Error (MAE). The results show that historical PM 2.5 concentration data are crucial for the prediction precision of the machine learning models. Specifically, on the meteorological datasets, the stacking model, XGBoost, and RF performed better for hourly prediction, and the stacking model, XGBoost, and LightGBM performed better for daily prediction. On the datasets combining historical PM 2.5 concentrations with meteorological data, the stacking model, LightGBM, and XGBoost performed better at both the hourly and daily scales. Overall, the stacking model outperformed the individual models. XGBoost was the best individual model for predicting PM 2.5 concentration from meteorological data, and LightGBM was the best individual model for predicting it from historical PM 2.5 data combined with meteorological data.
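
The three evaluation metrics named in the abstract can be illustrated with a short sketch. The code below is not taken from the paper: it uses the standard textbook definitions of R², RMSE, and MAE, and the function name `regression_metrics` and the sample concentrations are hypothetical, chosen only for illustration.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R2, RMSE, MAE) for paired observed and predicted values.

    Standard definitions are assumed; the paper's exact formulas may differ.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observations are identical")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Hypothetical PM 2.5 values in micrograms per cubic metre (not from the paper).
observed = [35.0, 42.0, 50.0, 61.0]
predicted = [33.0, 45.0, 49.0, 58.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

A higher R² and lower RMSE and MAE indicate a better fit. RMSE penalizes large errors more heavily than MAE, which is one reason the two are often reported together.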

Keywords: PM 2.5 prediction; machine learning; stacking; meteorological factor
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2023

Downloads:
https://www.mdpi.com/2071-1050/15/14/11408/pdf (application/pdf)
https://www.mdpi.com/2071-1050/15/14/11408/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:15:y:2023:i:14:p:11408-:d:1200319

Sustainability is currently edited by Ms. Alexandra Wu