Statistical Spatial Interpretable Machine Learning in R Using Tree Ensembles and SHAP Values
Mehmet Güney Celbiş and Louafi Bouzouina
Additional contact information
Mehmet Güney Celbiş: LAET - Laboratoire Aménagement Économie Transports - UL2 - Université Lumière - Lyon 2 - ENTPE - École Nationale des Travaux Publics de l'État - CNRS - Centre National de la Recherche Scientifique
Louafi Bouzouina: LAET - Laboratoire Aménagement Économie Transports - UL2 - Université Lumière - Lyon 2 - ENTPE - École Nationale des Travaux Publics de l'État - CNRS - Centre National de la Recherche Scientifique
Working Papers from HAL
Abstract:
This handbook chapter presents, discusses, and explores the uses of statistical machine learning algorithms and interpretable machine learning tools in the context of spatial analysis. It is aimed mostly at researchers and practitioners in urban and regional spatial analysis, and in the field of regional science in general. Using a relatively simple dataset downloadable as part of an R package, the chapter applies a series of tree-based machine learning models, with XGBoost as the primary one, and analyzes the results using SHAP values. Spatial features, spatial cross-validation, and spatial dependence are the focal topics. The use of coordinates and spatially lagged features, and their consequences for predictions, are investigated while taking into account the potential data leakage caused by the spatial proximity of data instances in the calibration and validation sets. The chapter demonstrates the advantages of these techniques for spatial analysis while highlighting the possible drawbacks of internalizing spatial information into machine learning models. The demonstration employs models that predict urban noise levels.
Keywords: Spatial statistics; Tree-based models; Spatial cross-validation; Interpretable machine learning; SHAP values; Working Papers du LAET
Date: 2025
Note: View the original document on HAL open archive server: https://hal.science/hal-05302147v1
Downloads: https://hal.science/hal-05302147v1/document (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:hal:wpaper:hal-05302147
Bibliographic data for series maintained by CCSD.