Digital Mapping of Soil pH Based on Machine Learning Combined with Feature Selection Methods in East China
Zhi-Dong Zhao, Ming-Song Zhao, Hong-Liang Lu, Shi-Hang Wang and Yuan-Yuan Lu
Additional contact information
Zhi-Dong Zhao: School of Geomatics, Anhui University of Science and Technology, Huainan 232001, China
Ming-Song Zhao: School of Geomatics, Anhui University of Science and Technology, Huainan 232001, China
Hong-Liang Lu: School of Geomatics, Anhui University of Science and Technology, Huainan 232001, China
Shi-Hang Wang: School of Geomatics, Anhui University of Science and Technology, Huainan 232001, China
Yuan-Yuan Lu: Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment of the People’s Republic of China, Nanjing 210042, China
Sustainability, 2023, vol. 15, issue 17, 1-13
Abstract:
This study evaluated and compared the performance of random forest (RF) and support vector regression (SVR) models combined with three feature selection methods, namely recursive feature elimination (RFE), simulated annealing feature selection (SAFS), and selection by filtering (SBF), for predicting soil pH in Anhui Province, East China. For comparison, RF and SVR models were also built using all original features (ALL-RF, ALL-SVR). A total of 140 samples were selected following the principles of randomness, uniformity, and representativeness, taking into account combinations of landscape elements such as topography, parent material, and land use. Auxiliary data, including climatic, topographic, and vegetation indexes, were used as predictors of soil pH. The results showed that, compared with using all original modeling features (ALL-RF, ALL-SVR), combining the three feature selection algorithms with RF and SVR eliminated some redundant features and effectively improved the prediction accuracy of the soil pH model. For the RF model, the calibration RMSE and MAE of the RFE-RF model were 0.73 and 0.57, respectively, and its R² was the highest among the four RF models. On the testing set, the RFE-RF model had an R² of 0.61, which was better than that of the ALL-RF model (R² = 0.45) and lower than those of the SAFS-RF (R² = 0.71) and SBF-RF (R² = 0.69) models. For the SVR model, the RFE-RF model was more robust and had better generalization ability. The accuracy of digital soil mapping can be improved through feature selection.
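The abstract compares models by RMSE, MAE, and R². As a minimal sketch of how these three metrics can be computed, the Python below uses only the standard library. The function names and the small observed/predicted pH values are hypothetical and are not taken from the study. The R² here is the coefficient-of-determination form (1 minus the ratio of residual to total sum of squares), which is an assumption: the paper may define R² differently, for example as a squared correlation.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical soil pH values, for illustration only (not data from the study).
observed_ph = [5.0, 6.0, 7.0, 8.0]
predicted_ph = [5.5, 6.0, 6.5, 8.0]

print(f"RMSE = {rmse(observed_ph, predicted_ph):.3f}")
print(f"MAE  = {mae(observed_ph, predicted_ph):.3f}")
print(f"R2   = {r2(observed_ph, predicted_ph):.3f}")
```

Lower RMSE and MAE and higher R² indicate a closer fit. Computing the same metrics on both the calibration and testing sets, as the abstract reports, helps show whether a model generalizes beyond the samples it was fitted on.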
Keywords: soil pH; feature selection; random forest; support vector regression parameter tuning; multiple soil classes
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2023
Downloads:
https://www.mdpi.com/2071-1050/15/17/12874/pdf (application/pdf)
https://www.mdpi.com/2071-1050/15/17/12874/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:15:y:2023:i:17:p:12874-:d:1225244
Sustainability is currently edited by Ms. Alexandra Wu