Sustainable Air Quality Detection Using Sequential Forward Selection-Based ML Algorithms

Rezk, Nermeen Gamal; Alshathri, Samah; Sayed, Amged; Hemdan, Ezz El-Din; El-Behery, Heba

Nermeen Gamal Rezk, Samah Alshathri, Amged Sayed, Ezz El-Din Hemdan and Heba El-Behery
Additional contact information
Nermeen Gamal Rezk: Department of Computer Science and Engineering, Faculty of Engineering, Kafrelsheikh University, Kafr El Sheikh 33516, Egypt
Samah Alshathri: Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
Amged Sayed: Department of Electrical Energy Engineering, College of Engineering & Technology, Arab Academy for Science Technology & Maritime Transport, Smart Village Campus, Giza 12577, Egypt
Ezz El-Din Hemdan: Department of Computer Science and Engineering, Faculty of Electronic Engineering, Menoufia University, Menoufia 32952, Egypt
Heba El-Behery: Department of Computer Science and Engineering, Faculty of Engineering, Kafrelsheikh University, Kafr El Sheikh 33516, Egypt

Sustainability, 2024, vol. 16, issue 24, 1-17

Abstract: Air pollution has exceeded anticipated safety limits, and addressing it is crucial for sustainability, particularly in countries with high pollution levels. Monitoring and forecasting air quality is therefore essential for sustainable urban development. This paper presents a multiclass classification approach that uses two feature selection techniques, Sequential Forward Selection (SFS) and filter feature selection (FFS), each combined with different machine learning and ensemble techniques, to predict air quality and to ensure that the most relevant features are included in datasets for air quality determination. The results show that SFS outperforms FFS across the ML methods considered, including the AdaBoost Classifier, the Extra Tree Classifier, Random Forest (RF), and the Bagging Classifier, for efficiently determining the Air Quality Index (AQI). The models’ performance is assessed using predetermined performance metrics. The AdaBoost Classifier with FFS has the lowest accuracy (78.4%), while the RF model with SFS achieves the highest (99.99%). On the raw dataset, the F1-score, recall, and precision of the RF model with SFS are 99.96%, 99.97%, and 99.98%, respectively. The experimental results therefore show the superiority, reliability, and robustness of the proposed approach in determining the AQI effectively.
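
As a rough illustration of the Sequential Forward Selection idea named in the abstract, the sketch below greedily adds, one at a time, the feature that most improves a classifier's score. It is not the authors' pipeline: the paper pairs SFS with ensemble models such as Random Forest, whereas this standard-library sketch swaps in a simple nearest-centroid classifier scored on its own training data, and the toy readings, class labels, and function names (`centroid_accuracy`, `sequential_forward_selection`) are all assumptions made up for the example.

```python
from typing import Callable, List, Sequence

Rows = Sequence[Sequence[float]]
Labels = Sequence[str]


def centroid_accuracy(X: Rows, y: Labels, features: List[int]) -> float:
    """Training-set accuracy of a nearest-centroid classifier
    that only looks at the given feature columns."""
    classes = sorted(set(y))
    centroids = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        centroids[c] = [sum(r[f] for r in rows) / len(rows) for f in features]

    correct = 0
    for x, label in zip(X, y):
        point = [x[f] for f in features]
        # Predict the class whose centroid is closest (squared Euclidean distance).
        predicted = min(
            classes,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == label
    return correct / len(y)


def sequential_forward_selection(
    X: Rows,
    y: Labels,
    k: int,
    score: Callable[[Rows, Labels, List[int]], float],
) -> List[int]:
    """Greedy SFS: start with no features and repeatedly add the single
    remaining feature that gives the best score, until k are selected."""
    selected: List[int] = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: columns are [noise, pollutant_a, pollutant_b].
# Only column 1 separates the three made-up air quality classes cleanly.
X = [
    [5.0, 10.0, 1.0], [1.0, 12.0, 5.0], [3.0, 11.0, 9.0],   # "good"
    [4.0, 50.0, 2.0], [2.0, 55.0, 6.0], [5.0, 52.0, 8.0],   # "moderate"
    [1.0, 90.0, 1.5], [3.0, 95.0, 5.5], [4.0, 92.0, 9.5],   # "poor"
]
y = ["good"] * 3 + ["moderate"] * 3 + ["poor"] * 3

first = sequential_forward_selection(X, y, 1, centroid_accuracy)
print("first feature selected:", first)
print("accuracy with that feature:", centroid_accuracy(X, y, first))
```

In practice the score would normally come from cross-validation with the chosen model rather than from training-set accuracy, which is used here only to keep the sketch short and deterministic.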

Keywords: air quality index; machine learning; ensemble learning; sequential forward selection; feature scaling; feature encoding
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2024

Downloads:
https://www.mdpi.com/2071-1050/16/24/10835/pdf (application/pdf)
https://www.mdpi.com/2071-1050/16/24/10835/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:16:y:2024:i:24:p:10835-:d:1541345

Sustainability is currently edited by Ms. Alexandra Wu
