Comparison of Missing Data Infilling Mechanisms for Recovering a Real-World Single Station Streamflow Observation
Thelma Dede Baddoo,
Zhijia Li,
Samuel Nii Odai,
Kenneth Rodolphe Chabi Boni,
Isaac Kwesi Nooni and
Samuel Ato Andam-Akorful
Additional contact information
Thelma Dede Baddoo: Binjiang College, Nanjing University of Information Science & Technology, No.333 Xishan Road, Wuxi 214105, China
Zhijia Li: College of Hydrology and Water Resources, Hohai University, Nanjing 210098, China
Samuel Nii Odai: Office of the Vice Chancellor, Accra Technical University, Accra GA000, Ghana
Kenneth Rodolphe Chabi Boni: College of Computer and Information Engineering, Hohai University, Nanjing 211100, China
Isaac Kwesi Nooni: Binjiang College, Nanjing University of Information Science & Technology, No.333 Xishan Road, Wuxi 214105, China
Samuel Ato Andam-Akorful: Department of Geomatic Engineering, Kwame Nkrumah University of Science and Technology, Kumasi AK000, Ghana
IJERPH, 2021, vol. 18, issue 16, 1-26
Abstract:
Reconstructing missing streamflow data can be challenging when additional data are not available, and studies that impute real-world datasets to investigate how the accuracy of imputation algorithms can be ascertained for such datasets are lacking. This study investigated how complex a missing data reconstruction scheme must be to obtain relevant results for a real-world single-station streamflow observation and so facilitate its further use. The investigation applied different missing data mechanisms, ranging from univariate algorithms to multiple imputation methods designed for multivariate data, with time taken as an explicit variable. The accuracy of these schemes was assessed using the total error measurement (TEM) and a localized error measurement (LEM) recommended in this study. The results show that univariate missing value algorithms, which are specially developed to handle univariate time series, provide satisfactory results, but those that provide the best results are usually time-consuming and computationally intensive. Also, multiple imputation algorithms that consider the surrounding observed values and/or can capture the characteristics of the data provide similar results to the univariate missing data algorithms and, in some cases, perform better without the added time and computational cost when time is taken as an explicit variable. Furthermore, the LEM would be especially useful when the missing data are in specific portions of the dataset or where very large gaps of ‘missingness’ occur. Finally, proper handling of missing values in real-world hydroclimatic datasets depends on both imputation and extensive study of the particular dataset to be imputed.
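Illustrative sketch (not from the article): the difference between a total and a localized error measurement can be shown on a toy series. The abstract does not give the TEM or LEM formulas, so this sketch assumes both are root-mean-square errors, computed over the whole series and over the gap positions only. Linear interpolation stands in for a univariate imputation algorithm; the function names and the flow values are hypothetical.

```python
import math

def linear_interpolate(series):
    """Fill None gaps by straight-line interpolation between the nearest
    observed neighbours (edge gaps copy the nearest observed value)."""
    filled = list(series)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is None:
            start = i - 1              # last observed index before the gap
            j = i
            while j < n and filled[j] is None:
                j += 1                 # j ends at first observed index after the gap
            if start < 0 and j >= n:
                raise ValueError("series has no observed values")
            for k in range(i, j):
                if start < 0:
                    filled[k] = filled[j]
                elif j >= n:
                    filled[k] = filled[start]
                else:
                    w = (k - start) / (j - start)
                    filled[k] = filled[start] + w * (filled[j] - filled[start])
            i = j
        else:
            i += 1
    return filled

def rmse(truth, estimate, indices):
    """Root-mean-square error restricted to the given positions."""
    squared = [(truth[i] - estimate[i]) ** 2 for i in indices]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical daily flows; a gap is removed artificially so the truth is known.
truth = [10.0, 12.0, 14.0, 20.0, 18.0, 16.0, 15.0, 14.0]
gap = [2, 3, 4]
masked = [None if i in gap else v for i, v in enumerate(truth)]

imputed = linear_interpolate(masked)

# Assumed "total" error: over every time step, observed or imputed.
total_error = rmse(truth, imputed, range(len(truth)))
# Assumed "localized" error: over the imputed positions only.
local_error = rmse(truth, imputed, gap)

print(f"total error:     {total_error:.3f}")
print(f"localized error: {local_error:.3f}")
```

Under these assumptions the whole-series figure is diluted by the observed points, whose error is zero, so the localized figure is the larger of the two. This is in line with the abstract's remark that the LEM is most useful when the missing data sit in specific portions of the dataset.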
Keywords: missing data; univariate imputation; multiple imputation; SPSS; R; China
JEL-codes: I I1 I3 Q Q5
Date: 2021
Downloads: (external link)
https://www.mdpi.com/1660-4601/18/16/8375/pdf (application/pdf)
https://www.mdpi.com/1660-4601/18/16/8375/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jijerp:v:18:y:2021:i:16:p:8375-:d:610320
IJERPH is currently edited by Ms. Jenna Liu
Bibliographic data for series maintained by MDPI Indexing Manager.