Post-hoc Evaluation of Sample Size in a Regional Digital Soil Mapping Project
Daniel D. Saurette,
Richard J. Heck,
Adam W. Gillespie,
Aaron A. Berg and
Asim Biswas
Additional contact information
Daniel D. Saurette: School of Environmental Sciences, University of Guelph, 50 Stone Rd East, Guelph, ON N1G 2W1, Canada
Richard J. Heck: School of Environmental Sciences, University of Guelph, 50 Stone Rd East, Guelph, ON N1G 2W1, Canada
Adam W. Gillespie: School of Environmental Sciences, University of Guelph, 50 Stone Rd East, Guelph, ON N1G 2W1, Canada
Aaron A. Berg: Department of Geography, Environment & Geomatics, University of Guelph, 50 Stone Rd East, Guelph, ON N1G 2W1, Canada
Asim Biswas: School of Environmental Sciences, University of Guelph, 50 Stone Rd East, Guelph, ON N1G 2W1, Canada
Land, 2025, vol. 14, issue 3, 1-22
Abstract:
The transition from conventional soil mapping (CSM) to digital soil mapping (DSM) affects not only the final map products but also the concepts of scale, resolution, and sampling intensity. This is critical because, in the CSM approach, sampling intensity is tied to the intended publication scale of the soil map, which standardized sampling. This is not the case for DSM, where sample size varies widely by project, and sampling design studies have largely focused on where to sample without due consideration for sample size. Using a regional soil survey dataset with 1791 sampled and described soil profiles, we first extracted an external validation dataset using the conditioned Latin hypercube sampling (cLHS) algorithm and then created repeated (n = 10) sample plans of increasing size from the remaining calibration sites using cLHS, feature space coverage sampling (FSCS), and simple random sampling (SRS). We then trained random forest (RF) models for four soil properties (pH, CEC, clay content, and SOC) at five different depths. We identified the effective sample size based on the model learning curves and compared it to the optimal sample size determined from the Jensen–Shannon divergence (D_JS) applied to the environmental covariates. Maps were then generated from models that used all the calibration points (reference maps) and from models that used the optimal sample size (optimal maps) for comparison. Our findings revealed that the optimal sample sizes based on the D_JS analysis were closely aligned with the effective sample sizes from the model learning curves (815 for cLHS, 832 for FSCS, and 847 for SRS). Furthermore, the comparison of the optimal maps to the reference maps showed little difference in the global statistics (concordance correlation coefficient and root mean square error) and spatial trends of the data, confirming that the optimal sample size was sufficient for creating predictions of similar accuracy to the full calibration dataset. Finally, we conclude that the Ottawa soil survey project could have saved between CAD 330,500 and CAD 374,000 (CAD = Canadian dollars) if the tools for determining optimal sample size presented herein had existed during the project planning phase. This clearly illustrates the need for additional research in determining an optimal sample size for DSM and demonstrates that operationalization of DSM in public institutions requires a sound scientific basis for determining sample size.
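The divergence-based idea in the abstract can be illustrated with a minimal sketch: bin one environmental covariate, draw samples of increasing size, and track how far each sample's histogram sits from the full population's histogram using the Jensen–Shannon divergence. This is not the authors' implementation. The abstract does not state the binning, how covariates are aggregated, or the rule for choosing the optimum, so the 20-bin histogram, the single synthetic covariate, the use of simple random sampling only, and the function names `histogram`, `js_divergence`, and `divergence_curve` are all assumptions made for illustration.

```python
import bisect
import math
import random


def histogram(values, edges):
    """Relative frequencies of `values` over the bins defined by `edges`."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # bisect_right places v == edges[i] in bin i; values at or above the
        # top edge are clamped into the last bin, values below edges[0] are skipped.
        i = min(bisect.bisect_right(edges, v) - 1, len(counts) - 1)
        if i >= 0:
            counts[i] += 1
    total = sum(counts)
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) of two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Terms with a_i == 0 contribute nothing; b_i > 0 whenever a_i > 0.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def divergence_curve(population, sizes, n_bins=20, n_repeats=10, seed=0):
    """Mean D_JS between random samples of each size and the full population.

    Assumption: simple random sampling and equal-width bins; n_repeats=10
    mirrors the repeated sample plans mentioned in the abstract.
    """
    rng = random.Random(seed)
    lo, hi = min(population), max(population)
    edges = [lo + (hi - lo) * i / n_bins for i in range(n_bins + 1)]
    edges[-1] = hi  # guard against floating-point drift at the top edge
    pop_hist = histogram(population, edges)
    curve = []
    for n in sizes:
        d = [
            js_divergence(histogram(rng.sample(population, n), edges), pop_hist)
            for _ in range(n_repeats)
        ]
        curve.append(sum(d) / len(d))
    return curve


# Synthetic stand-in for one covariate (hypothetical data, not from the study).
rng = random.Random(42)
covariate = [rng.gauss(0.0, 1.0) for _ in range(2000)]
sizes = [25, 100, 400, 1600]
for n, d in zip(sizes, divergence_curve(covariate, sizes)):
    print(f"n={n:5d}  mean D_JS={d:.4f}")
```

In a sketch like this, the divergence typically falls quickly at small sample sizes and then flattens; a candidate "optimal" size would be read off where additional samples stop reducing the divergence appreciably. How that point is chosen is itself an assumption here.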
Keywords: sampling design; sample size; digital soil mapping; conventional soil mapping; divergence metrics; operational soil survey
JEL-codes: Q15 Q2 Q24 Q28 Q5 R14 R52
Date: 2025
Downloads:
https://www.mdpi.com/2073-445X/14/3/545/pdf (application/pdf)
https://www.mdpi.com/2073-445X/14/3/545/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jlands:v:14:y:2025:i:3:p:545-:d:1605846
Land is currently edited by Ms. Carol Ma
Bibliographic data for series maintained by MDPI Indexing Manager.