Application of ANN, XGBoost, and Other ML Methods to Forecast Air Quality in Macau
Thomas M. T. Lei (),
Stanley C. W. Ng and
Shirley W. I. Siu
Additional contact information
Thomas M. T. Lei: Institute of Science and Environment, University of Saint Joseph, Macau, China
Stanley C. W. Ng: Institute of Science and Environment, University of Saint Joseph, Macau, China
Shirley W. I. Siu: Institute of Science and Environment, University of Saint Joseph, Macau, China
Sustainability, 2023, vol. 15, issue 6, 1-17
Abstract:
Air pollution in Macau has become a serious problem following the Pearl River Delta’s (PRD) rapid industrialization that began in the 1990s. With this in mind, Macau needs an air quality forecast system that accurately predicts pollutant concentration during the occurrence of pollution episodes to warn the public ahead of time. Five different state-of-the-art machine learning (ML) algorithms were applied to create predictive models to forecast PM 2.5 , PM 10, and CO concentrations for the next 24 and 48 h, which included artificial neural networks (ANN), random forest (RF), extreme gradient boosting (XGBoost), support vector machine (SVM), and multiple linear regression (MLR), to determine the best ML algorithms for the respective pollutants and time scale. The diurnal measurements of air quality data in Macau from 2016 to 2021 were obtained for this work. The 2020 and 2021 datasets were used for model testing, while the four-year data before 2020 and 2021 were used to build and train the ML models. Results show that the ANN, RF, XGBoost, SVM, and MLR models were able to provide good performance in building up a 24-h forecast with a higher coefficient of determination (R 2 ) and lower root mean square error (RMSE), mean absolute error (MAE), and biases (BIAS). Meanwhile, all the ML models in the 48-h forecasting performance were satisfactory enough to be accepted as a two-day continuous forecast even if the R 2 value was lower than the 24-h forecast. The 48-h forecasting model could be further improved by proper feature selection based on the 24-h dataset, using the Shapley Additive Explanations (SHAP) value test and the adjusted R 2 value of the 48-h forecasting model. In conclusion, the above five ML algorithms were able to successfully forecast the 24 and 48 h of pollutant concentration in Macau, with the RF and SVM models performing the best in the prediction of PM 2.5 and PM 10 , and CO in both 24 and 48-h forecasts.
Keywords: air pollution; air quality; air quality forecast; machine learning; Macau (search for similar items in EconPapers)
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56 (search for similar items in EconPapers)
Date: 2023
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
https://www.mdpi.com/2071-1050/15/6/5341/pdf (application/pdf)
https://www.mdpi.com/2071-1050/15/6/5341/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:15:y:2023:i:6:p:5341-:d:1100059
Access Statistics for this article
Sustainability is currently edited by Ms. Alexandra Wu
More articles in Sustainability from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().