Prediction of Daily Mean PM10 Concentrations Using Random Forest, CART Ensemble and Bagging Stacked by MARS

Snezhana Gocheva-Ilieva, Atanas Ivanov and Maya Stoimenova-Minova
Additional contact information
Snezhana Gocheva-Ilieva: Department of Mathematical Analysis, Faculty of Mathematics and Informatics, Paisii Hilendarski University of Plovdiv, 4000 Plovdiv, Bulgaria
Atanas Ivanov: Department of Mathematical Analysis, Faculty of Mathematics and Informatics, Paisii Hilendarski University of Plovdiv, 4000 Plovdiv, Bulgaria
Maya Stoimenova-Minova: Department of Mathematical Analysis, Faculty of Mathematics and Informatics, Paisii Hilendarski University of Plovdiv, 4000 Plovdiv, Bulgaria

Sustainability, 2022, vol. 14, issue 2, 1-26

Abstract: A novel framework for stacked regression based on machine learning was developed to predict the daily average concentrations of particulate matter (PM10), one of Bulgaria’s primary health concerns. The measurements of nine meteorological parameters were introduced as independent variables. The goal was to carefully study a limited number of initial predictors and extract stochastic information from them to build an extended set of data that allowed the creation of highly efficient predictive models. Four base models using random forest, CART ensemble and bagging, and their rotation variants, were built and evaluated. The heterogeneity of these base models was achieved by introducing five types of diversity, including a new simplified selective ensemble algorithm. The predictions from the four base models were then used as predictors in multivariate adaptive regression splines (MARS) models. All models were statistically tested using out-of-bag estimation or 5-fold and 10-fold cross-validation. In addition, a variable importance analysis was conducted. The proposed framework was used for short-term forecasting of out-of-sample data for seven days. It was shown that the stacked models outperformed all single base models. An index of agreement IA = 0.986 and a coefficient of determination of about 95% were achieved.
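
The abstract reports an index of agreement (IA) and a coefficient of determination but does not give their formulas. The following is a minimal sketch, assuming that IA refers to Willmott's index of agreement and that the coefficient of determination is the usual 1 - SS_res/SS_tot; the function names are hypothetical and nothing here is taken from the authors' code.

```python
from statistics import fmean


def index_of_agreement(observed, predicted):
    """Willmott's index of agreement (assumed definition of IA).

    IA = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2)
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    o_mean = fmean(observed)
    # Sum of squared prediction errors.
    num = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    # Potential error: distances of both series from the observed mean.
    den = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2
              for o, p in zip(observed, predicted))
    if den == 0:
        # Every observed and predicted value equals the mean: perfect match.
        return 1.0
    return 1.0 - num / den


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    o_mean = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - o_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot
```

Both functions take plain sequences of numbers, for example daily observed and predicted PM10 values. A value of 1.0 means perfect agreement in either metric.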

Keywords: air pollution; machine learning; stacking; rotation ensemble; bagging; selective ensemble; diversity strategy
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2022

Downloads: (external link)
https://www.mdpi.com/2071-1050/14/2/798/pdf (application/pdf)
https://www.mdpi.com/2071-1050/14/2/798/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:14:y:2022:i:2:p:798-:d:722504

Sustainability is currently edited by Ms. Alexandra Wu

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jsusta:v:14:y:2022:i:2:p:798-:d:722504