Comparing Methods to Impute Missing Daily Ground-Level PM10 Concentrations between 2010–2017 in South Africa
Oluwaseyi Olalekan Arowosegbe,
Martin Röösli,
Nino Künzli,
Apolline Saucy,
Temitope Christina Adebayo-Ojo,
Mohamed F. Jeebhay,
Mohammed Aqiel Dalvie and
Kees de Hoogh
Additional contact information
Oluwaseyi Olalekan Arowosegbe: Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute, Socinstrasse 57, CH-4002 Basel, Switzerland
Martin Röösli: Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute, Socinstrasse 57, CH-4002 Basel, Switzerland
Nino Künzli: Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute, Socinstrasse 57, CH-4002 Basel, Switzerland
Apolline Saucy: Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute, Socinstrasse 57, CH-4002 Basel, Switzerland
Temitope Christina Adebayo-Ojo: Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute, Socinstrasse 57, CH-4002 Basel, Switzerland
Mohamed F. Jeebhay: Centre for Environmental and Occupational Health Research, School of Public Health and Family Medicine, University of Cape Town, Rondebosch, 7700 Cape Town, South Africa
Mohammed Aqiel Dalvie: Centre for Environmental and Occupational Health Research, School of Public Health and Family Medicine, University of Cape Town, Rondebosch, 7700 Cape Town, South Africa
Kees de Hoogh: Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute, Socinstrasse 57, CH-4002 Basel, Switzerland
IJERPH, 2021, vol. 18, issue 7, 1-13
Abstract:
High-quality, complete ambient air quality monitoring data are central to supporting actions that mitigate the impact of ambient air pollution. In South Africa, however, continuous ground-level air pollution monitoring data are scarce and incomplete. To address this issue, we developed and compared different modeling approaches to impute missing daily average particulate matter (PM10) data between 2010 and 2017 using spatiotemporal predictor variables. The random forest (RF) machine learning method was used to explore the relationship between average daily PM10 concentrations and spatiotemporal predictors such as meteorological, land use and source-related variables. National (8 models), provincial (32) and site-specific (44) RF models were developed to impute missing daily PM10 data. The annual national, provincial and site-specific RF cross-validation (CV) models explained on average 78%, 70% and 55% of the variation in ground-level PM10 concentrations, respectively. The spatial components of the national and provincial CV RF models explained on average 22% and 48%, while the temporal components of the national, provincial and site-specific CV RF models explained on average 78%, 68% and 57% of the variation in ground-level PM10 concentrations, respectively. This study demonstrates a feasible RF-based approach to imputing missing measurement data in areas where data collection is sparse and incomplete.
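The abstract reports overall, spatial and temporal cross-validation performance but does not say how the components are computed. The sketch below illustrates one common convention and is an assumption, not the authors' documented procedure: the spatial component is the R² between site-mean observed and site-mean predicted values, and the temporal component is the R² between each day's deviation from its site mean, observed versus predicted. The function names `r_squared` and `decompose_r2`, the record layout and the toy values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def r_squared(observed, predicted):
    """Squared Pearson correlation between two equal-length sequences."""
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        return float("nan")  # undefined when either series is constant
    return cov * cov / (var_o * var_p)


def decompose_r2(records):
    """Split CV performance into overall, spatial and temporal R².

    records: iterable of (site_id, observed, predicted) tuples, where
    `predicted` comes from held-out cross-validation folds.
    Assumed convention (not taken from the abstract):
      spatial  = R² of site means (between-site contrast)
      temporal = R² of daily deviations from each site's mean
    """
    records = list(records)
    by_site = defaultdict(list)
    for site, obs, pred in records:
        by_site[site].append((obs, pred))

    obs_mean = {s: mean(o for o, _ in v) for s, v in by_site.items()}
    pred_mean = {s: mean(p for _, p in v) for s, v in by_site.items()}
    sites = sorted(by_site)

    return {
        "overall": r_squared([o for _, o, _ in records],
                             [p for _, _, p in records]),
        "spatial": r_squared([obs_mean[s] for s in sites],
                             [pred_mean[s] for s in sites]),
        "temporal": r_squared([o - obs_mean[s] for s, o, _ in records],
                              [p - pred_mean[s] for s, _, p in records]),
    }


# Made-up toy values for two hypothetical sites (not data from the study).
toy_records = [
    ("A", 10, 12), ("A", 20, 22), ("A", 30, 32),
    ("B", 40, 38), ("B", 50, 48), ("B", 60, 58),
]
result = decompose_r2(toy_records)
print(result)
```

With only a few monitoring sites the spatial R² rests on very few points, so it is typically less stable than the temporal R².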
Keywords: air pollution; Random Forest; imputation; particulate matter; environmental exposure; South Africa
JEL-codes: I I1 I3 Q Q5
Date: 2021
Downloads: (external link)
https://www.mdpi.com/1660-4601/18/7/3374/pdf (application/pdf)
https://www.mdpi.com/1660-4601/18/7/3374/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jijerp:v:18:y:2021:i:7:p:3374-:d:523488
IJERPH is currently edited by Ms. Jenna Liu
Bibliographic data for series maintained by MDPI Indexing Manager.