Techniques to Improve Ecological Interpretability of Black-Box Machine Learning Models

Thomas Welchowski, Kelly O. Maloney, Richard Mitchell and Matthias Schmid
Additional contact information
Thomas Welchowski: University of Bonn
Kelly O. Maloney: U.S. Geological Survey (USGS) Eastern Ecological Science Center at the Leetown Research Laboratory
Richard Mitchell: U.S. Environmental Protection Agency Office of Water Washington
Matthias Schmid: University of Bonn

Journal of Agricultural, Biological and Environmental Statistics, 2022, vol. 27, issue 1, No 10, 175-197

Abstract: Statistical modeling of ecological data is often faced with a large number of variables as well as possible nonlinear relationships and higher-order interaction effects. Gradient boosted trees (GBT) have been successful in addressing these issues and have shown a good predictive performance in modeling nonlinear relationships, in particular in classification settings with a categorical response variable. They also tend to be robust against outliers. However, their black-box nature makes it difficult to interpret these models. We introduce several recently developed statistical tools to the environmental research community in order to advance interpretation of these black-box models. To analyze the properties of the tools, we applied gradient boosted trees to investigate biological health of streams within the contiguous USA, as measured by a benthic macroinvertebrate biotic index. Based on these data and a simulation study, we demonstrate the advantages and limitations of partial dependence plots (PDP), individual conditional expectation (ICE) curves and accumulated local effects (ALE) in their ability to identify covariate–response relationships. Additionally, interaction effects were quantified according to interaction strength (IAS) and Friedman’s $H^2$ statistic. Interpretable machine learning techniques are useful tools to open the black-box of gradient boosted trees in the environmental sciences. This finding is supported by our case study on the effect of impervious surface on the benthic condition, which agrees with previous results in the literature. Overall, the most important variables were ecoregion, bed stability, watershed area, riparian vegetation and catchment slope. These variables were also present in most identified interaction effects. In conclusion, graphical tools (PDP, ICE, ALE) enable visualization and easier interpretation of GBT but should be supported by analytical statistical measures. Future methodological research is needed to investigate the properties of interaction tests. Supplementary materials accompanying this paper appear on-line.
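
As a rough illustration of two of the graphical tools named in the abstract, the sketch below computes ICE curves and a PDP for one covariate. It is not the authors' code or data: `black_box`, `ice_curves`, `partial_dependence` and the simulated covariates `x1` and `x2` are hypothetical stand-ins, and a simple closed-form function takes the place of a fitted gradient boosted tree model.

```python
import random

def black_box(x1, x2):
    # Hypothetical stand-in for a fitted model (not the paper's GBT):
    # a nonlinear main effect of x1 plus an x1-by-x2 interaction.
    return x1 ** 2 + x1 * x2

def ice_curves(model, data, grid):
    # One ICE curve per observation: x1 is varied over the grid
    # while that observation's own x2 value is held fixed.
    return [[model(g, x2) for g in grid] for (_, x2) in data]

def partial_dependence(curves):
    # The PDP is the pointwise average of the ICE curves.
    n = len(curves)
    return [sum(c[k] for c in curves) / n for k in range(len(curves[0]))]

random.seed(0)
# Simulated covariates, assumed uniform on [-1, 1] for illustration only.
data = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]

curves = ice_curves(black_box, data, grid)
pdp = partial_dependence(curves)

for g, p in zip(grid, pdp):
    print(f"x1 = {g:+.1f}  PDP = {p:.3f}")
```

In this toy setting the ICE curves are not parallel, because the slope of each curve depends on that observation's `x2`. Averaging them into a PDP hides this heterogeneity, which is one reason to inspect ICE curves and interaction measures alongside the PDP.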

Keywords: Boosting; Interpretable machine learning; Interaction terms; Macroinvertebrates; Stream health
Date: 2022

Downloads: (external link)
http://link.springer.com/10.1007/s13253-021-00479-7 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:jagbes:v:27:y:2022:i:1:d:10.1007_s13253-021-00479-7

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/13253

DOI: 10.1007/s13253-021-00479-7

Journal of Agricultural, Biological and Environmental Statistics is currently edited by Stephen Buckland

More articles in Journal of Agricultural, Biological and Environmental Statistics from Springer, The International Biometric Society, American Statistical Association
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:jagbes:v:27:y:2022:i:1:d:10.1007_s13253-021-00479-7