Predicting Heavy Metal Concentrations in Shallow Aquifer Systems Based on Low-Cost Physiochemical Parameters Using Machine Learning Techniques
Thi-Minh-Trang Huynh,
Chuen-Fa Ni,
Yu-Sheng Su,
Vo-Chau-Ngan Nguyen,
I-Hsien Lee,
Chi-Ping Lin and
Hoang-Hiep Nguyen
Additional contact information
Thi-Minh-Trang Huynh: Graduate Institute of Applied Geology, National Central University, Taoyuan 32001, Taiwan
Chuen-Fa Ni: Graduate Institute of Applied Geology, National Central University, Taoyuan 32001, Taiwan
Yu-Sheng Su: Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung 202301, Taiwan
Vo-Chau-Ngan Nguyen: College of Environment and Natural Resources, Can Tho University, Can Tho 94000, Vietnam
I-Hsien Lee: Graduate Institute of Applied Geology, National Central University, Taoyuan 32001, Taiwan
Chi-Ping Lin: Graduate Institute of Applied Geology, National Central University, Taoyuan 32001, Taiwan
Hoang-Hiep Nguyen: Graduate Institute of Applied Geology, National Central University, Taoyuan 32001, Taiwan
IJERPH, 2022, vol. 19, issue 19, 1-21
Abstract:
Monitoring ex-situ water parameters such as heavy metals requires time and laboratory work for water sampling and analysis, which can delay the response to ongoing pollution events. Previous studies have successfully applied fast modeling techniques, such as artificial intelligence algorithms, to predict heavy metals. However, neither the predictive value of low-cost features nor model explainability has been assessed in the modeling process. This study proposes a reliable and explainable framework for finding an effective model and feature set to predict heavy metals in groundwater. The integrated assessment framework has four steps: model selection uncertainty, feature selection uncertainty, predictive uncertainty, and model interpretability. The results show that Random Forest is the most suitable model and that quick-measure parameters can be used as predictors for arsenic (As), iron (Fe), and manganese (Mn). Although the model performance is promising, the predictions likely carry significant uncertainty. The findings also indicate that As is related to nutrients and spatial distribution, while Fe and Mn are affected by spatial distribution and salinity. Limitations and suggestions for improving prediction accuracy and interpretability are also discussed.
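The predictive-uncertainty step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses a bootstrap ensemble of one-split regression stumps as a simplified stand-in for Random Forest, and it reports the spread of the ensemble members' predictions, which reflects model variability only and is typically narrower than a calibrated prediction interval. The feature names (pH, electrical conductivity), the synthetic data, and all function names are hypothetical.

```python
import random
import statistics

def fit_stump(rows, targets):
    """Fit a one-split regression stump by minimizing squared error."""
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        for a, b in zip(values, values[1:]):
            thr = (a + b) / 2  # candidate threshold between distinct values
            left = [t for r, t in zip(rows, targets) if r[j] <= thr]
            right = [t for r, t in zip(rows, targets) if r[j] > thr]
            lm, rm = statistics.fmean(left), statistics.fmean(right)
            sse = (sum((t - lm) ** 2 for t in left)
                   + sum((t - rm) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    if best is None:  # no valid split: fall back to the overall mean
        m = statistics.fmean(targets)
        return (0, float("inf"), m, m)
    return best[1:]

def predict_stump(stump, row):
    j, thr, lm, rm = stump
    return lm if row[j] <= thr else rm

def fit_ensemble(rows, targets, n_members=100, seed=0):
    """Train each stump on a bootstrap resample of the data (bagging)."""
    rng = random.Random(seed)
    n = len(rows)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]
        members.append(fit_stump([rows[i] for i in idx],
                                 [targets[i] for i in idx]))
    return members

def predict_with_interval(members, row, lower=0.05, upper=0.95):
    """Return the ensemble mean and empirical quantiles of member predictions."""
    preds = sorted(predict_stump(m, row) for m in members)
    k = len(preds) - 1
    return statistics.fmean(preds), preds[round(lower * k)], preds[round(upper * k)]

# Synthetic, hypothetical data: [pH, electrical conductivity] -> concentration.
data_rng = random.Random(42)
rows = [[data_rng.uniform(6, 8), data_rng.uniform(200, 2000)] for _ in range(60)]
targets = [0.01 * ec + data_rng.gauss(0, 1) for _, ec in rows]

members = fit_ensemble(rows, targets, n_members=100)
mean, lo, hi = predict_with_interval(members, [7.0, 1900.0])
print(f"prediction: {mean:.2f}, ensemble spread: [{lo:.2f}, {hi:.2f}]")
```

A wide spread for a given sample suggests the point prediction should be treated with caution, which is the kind of check the abstract's remark about significant uncertainties points to.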
Keywords: Random Forest; heavy metals; groundwater quality; explainable artificial intelligence (XAI); prediction intervals
JEL-codes: I I1 I3 Q Q5
Date: 2022
Downloads: (external link)
https://www.mdpi.com/1660-4601/19/19/12180/pdf (application/pdf)
https://www.mdpi.com/1660-4601/19/19/12180/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jijerp:v:19:y:2022:i:19:p:12180-:d:925319
IJERPH is currently edited by Ms. Jenna Liu
Bibliographic data for series maintained by MDPI Indexing Manager.