Optimal machine learning techniques for meteorological modeling of PM2.5 concentration in five major polluted cities of South-East Asia

Sedra Shafi and Nicola Scafetta
Additional contact information
Sedra Shafi: University of Naples Federico II, Complesso Universitario di Monte S. Angelo
Nicola Scafetta: University of Naples Federico II, Complesso Universitario di Monte S. Angelo

Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, 2025, vol. 121, issue 6, No 25, 7025 pages

Abstract: Air quality across Southeast and Western Pacific Asia is declining at an accelerating pace because of population growth and industrial development. The region's meteorological factors, including monsoon seasonality, exert a significant influence on air pollution levels, particularly PM2.5 concentrations. In this study, we employ a statistical modeling approach to derive daily PM2.5 levels from meteorological parameters in five major polluted cities: Lahore (Pakistan), Delhi (India), Dhaka (Bangladesh), Hanoi (Vietnam), and Shanghai (China). The meteorological parameters considered, all known to affect air pollution levels, are wind speed, barometric pressure, temperature, and rainfall, with records covering 2020 to 2022. The statistical modeling was based on a comparative analysis of 35 machine learning (ML) regression techniques, with the aim of selecting the algorithms that most efficiently reconstruct and predict PM2.5 levels from meteorological variables alone. Specifically, each ML regression model was trained to reconstruct daily PM2.5 levels in 2020–2021, and then used both to fill in missing daily PM2.5 levels in 2020–2021 and to forecast the whole of 2022 using only the 2022 meteorological records. The results indicated that most of the daily and seasonal variability in PM2.5 levels could be reconstructed from meteorological conditions. However, the performance of the various ML models, as assessed by root mean square error (RMSE), varied considerably. Among the tested models, the Ensembles Boosted Tree method was the most efficient during the training period (2020 and 2021) and was also highly efficient in predicting the third year (2022) using only meteorological data. Additionally, the Trilayer Neural Network method was found to be the most effective at reconstructing the data after 3 years of training and may therefore be preferred for filling in short periods of missing PM2.5 data. In contrast, our comparative analyses showed that traditional multi-linear regression models under-performed in both reconstructing and predicting PM2.5 data. This study demonstrates the necessity and usefulness of assessing multiple ML regression methodologies to select those that perform best at reconstructing the data of interest (in our case PM2.5 records) from their hypothesized constructors (in our case meteorological parameters). In particular, this study highlights the utility of ML regression techniques for forecasting air quality and reconstructing missing pollution data, which is crucial for policy-making across South-East and Western-Pacific Asia, where only limited pollution monitoring infrastructure is available.
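The evaluation protocol described in the abstract (fit each candidate on 2020–2021, score it on 2022, rank by root mean square error) can be illustrated with a minimal sketch. This is not the authors' code and does not reproduce any of their 35 models. It assumes synthetic data with a single hypothetical predictor (wind speed) and an invented nonlinear relation to PM2.5, and it compares three toy regressors (a mean baseline, a one-variable linear fit, and a k-nearest-neighbour regressor) chosen only because they fit in a few lines of standard-library Python. All function and variable names are illustrative.

```python
import math
import random
from statistics import mean


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def fit_mean(xs, ys):
    """Baseline: always predict the mean of the training targets."""
    m = mean(ys)
    return lambda x: m


def fit_linear(xs, ys):
    """Ordinary least squares with one predictor."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)


def fit_knn(xs, ys, k=15):
    """k-nearest-neighbour regression: average the k closest training targets."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)

    return predict


# Synthetic stand-in data (an assumption, NOT the paper's records):
# one wind-speed value per day and a made-up nonlinear PM2.5 response.
rng = random.Random(0)
n_train, n_test = 731, 365  # days in 2020-2021 and in 2022
wind = [rng.uniform(0.0, 10.0) for _ in range(n_train + n_test)]
pm25 = [150.0 / (1.0 + w) + rng.gauss(0.0, 10.0) for w in wind]

# Chronological split: the first two years train, the third year is held out.
x_train, y_train = wind[:n_train], pm25[:n_train]
x_test, y_test = wind[n_train:], pm25[n_train:]

candidates = {"mean": fit_mean, "linear": fit_linear, "knn": fit_knn}
results = {}
for name, fit in candidates.items():
    model = fit(x_train, y_train)
    results[name] = {
        "train": rmse(y_train, [model(x) for x in x_train]),
        "test": rmse(y_test, [model(x) for x in x_test]),
    }

# Rank the candidates by their error on the held-out year.
best = min(results, key=lambda name: results[name]["test"])
for name, scores in sorted(results.items(), key=lambda kv: kv[1]["test"]):
    print(f"{name:>6}: train RMSE = {scores['train']:.2f}, test RMSE = {scores['test']:.2f}")
print("lowest held-out RMSE:", best)
```

Reporting both the training and the held-out RMSE mirrors the abstract's distinction between reconstructing the training period and predicting an unseen year: a model can rank well on one and poorly on the other, which is why the comparison is made on both.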

Keywords: Air pollution (PM2.5) assessment; Machine learning regression models; Pollution (PM2.5); Meteorological monsoon conditions; South-East and Western-Pacific Asia
Date: 2025

Downloads: (external link)
http://link.springer.com/10.1007/s11069-024-07077-z Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:nathaz:v:121:y:2025:i:6:d:10.1007_s11069-024-07077-z

Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11069

DOI: 10.1007/s11069-024-07077-z

Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards is currently edited by Thomas Glade, Tad S. Murty and Vladimír Schenk

More articles in Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards from Springer, International Society for the Prevention and Mitigation of Natural Hazards
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-05-18
Handle: RePEc:spr:nathaz:v:121:y:2025:i:6:d:10.1007_s11069-024-07077-z