Identification of Factors Influencing Episodes of High PM 10 Concentrations in the Air in Krakow (Poland) Using Random Forest Method

Tomasz Gorzelnik, Marek Bogacki and Robert Oleniacz
Additional contact information
Tomasz Gorzelnik: Department of Fundamental Research in Energy Engineering, Faculty of Energy and Fuels, AGH University of Krakow, Mickiewicza 30 Av., 30-059 Krakow, Poland
Marek Bogacki: Department of Environmental Management and Protection, Faculty of Geo-Data Science, Geodesy and Environmental Engineering, AGH University of Krakow, Mickiewicza 30 Av., 30-059 Krakow, Poland
Robert Oleniacz: Department of Environmental Management and Protection, Faculty of Geo-Data Science, Geodesy and Environmental Engineering, AGH University of Krakow, Mickiewicza 30 Av., 30-059 Krakow, Poland

Sustainability, 2024, vol. 16, issue 20, 1-23

Abstract: Episodes of elevated concentrations of gaseous pollutants and particulate matter (PM) are of major concern worldwide, especially in urban agglomerations. Krakow is an example of an urban–industrial agglomeration with persistently recurring exceedances of the PM 10 limit value in ambient air. In recent years, a number of legislative actions have been undertaken to improve air quality in this area. The multitude of factors that contribute to very high air pollutant concentrations makes such cases difficult to analyze using simple statistical methods. Machine learning (ML) methods can be an adequate alternative, especially when a sufficient amount of reliable data is available. The main aim of this paper was to examine the influence of various factors (including concentrations of the main gaseous pollutants and selected meteorological factors) on the occurrence of high PM 10 concentration episodes in the ambient air in Krakow (Poland) using the random forest algorithm. An original methodology was developed, based on the PM 10 limit and on binary classification of cases with and without high concentration episodes. The data were derived from routine public air quality monitoring and a local meteorological station. A range of random forest classification models was built with various predictor sets and for different subsets of the observations, coupled with variable importance analysis. The performance of the algorithm was assessed using confusion matrices. The variable importance rankings revealed, among other things, the dominant impact of the mixing layer height on the formation of elevated PM 10 concentration episodes. This research showed the usefulness of the random forest algorithm in identifying factors contributing to poor air quality, even in the absence of reliable emission data.
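
The binary labelling and the confusion-matrix assessment mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that an episode is any day whose mean PM 10 concentration exceeds the EU 24-hour limit value of 50 µg/m³ (the abstract does not state the exact threshold used), and the concentration values, the predicted labels, and the function names `label_episodes` and `confusion_matrix` are hypothetical. The random forest itself is not implemented here; the predicted labels stand in for the output of any classifier.

```python
from typing import List, Sequence, Tuple

# Assumption: the EU 24-hour PM10 limit value (50 µg/m³) serves as the
# episode threshold; the paper may define episodes differently.
PM10_DAILY_LIMIT = 50.0


def label_episodes(pm10: Sequence[float],
                   threshold: float = PM10_DAILY_LIMIT) -> List[int]:
    """Return 1 for days above the threshold (episode), 0 otherwise."""
    return [1 if value > threshold else 0 for value in pm10]


def confusion_matrix(y_true: Sequence[int],
                     y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count (tn, fp, fn, tp) for binary labels 0/1."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return tn, fp, fn, tp


# Made-up daily mean PM10 concentrations in µg/m³ (illustrative only).
pm10_observed = [32.0, 48.5, 51.2, 120.4, 75.0, 18.3]
y_true = label_episodes(pm10_observed)

# Hypothetical classifier output for the same six days.
y_pred = [0, 1, 1, 1, 0, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
sensitivity = tp / (tp + fn)  # share of real episodes that were detected
print(f"TN={tn} FP={fp} FN={fn} TP={tp} sensitivity={sensitivity:.2f}")
```

In a study of this kind, episode days are usually far rarer than non-episode days, so the individual cells of the confusion matrix tend to be more informative than overall accuracy.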

Keywords: air pollution; PM 10 episodes; meteorological factors; machine learning; random forest
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2024

Downloads: (external link)
https://www.mdpi.com/2071-1050/16/20/9015/pdf (application/pdf)
https://www.mdpi.com/2071-1050/16/20/9015/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:16:y:2024:i:20:p:9015-:d:1501249

Sustainability is currently edited by Ms. Alexandra Wu

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jsusta:v:16:y:2024:i:20:p:9015-:d:1501249