A Toolkit for Stability Assessment of Tree-Based Learners
Michel Philipp,
Achim Zeileis and
Carolin Strobl
Working Papers from Faculty of Economics and Statistics, Universität Innsbruck
Abstract:
Recursive partitioning techniques are established and frequently applied for exploring unknown structures in complex and possibly high-dimensional data sets. The methods can be used to detect interactions and nonlinear structures in a data-driven way by recursively splitting the predictor space to form homogeneous groups of observations. However, while the resulting trees are easy to interpret, they are also known to be potentially unstable: altering the data slightly can change the variables selected for splitting, the cutpoints, or both. Moreover, the methods do not provide measures of confidence for the selected splits, so users cannot assess the uncertainty of a given fitted tree. We present a toolkit of descriptive measures and graphical illustrations based on resampling that can be used to assess the stability of the variable and cutpoint selection in recursive partitioning. The summary measures and graphics available in the toolkit are illustrated using a real-world data set and implemented in the R package stablelearner.
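The resampling idea described in the abstract can be illustrated with a minimal sketch. It is not the stablelearner API and not the authors' implementation: all function names (`best_stump`, `split_stability`, `make_data`) are hypothetical, the simulated data are an assumption, the "tree" is reduced to a single root split of a regression tree, and plain bootstrap resampling is assumed. The sketch refits the split on each resample and records how often each variable is selected and which cutpoints are chosen.

```python
import random
from collections import Counter


def sum_sq(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_stump(rows, target):
    """Return (variable index, cutpoint) of the single split that minimizes
    the residual sum of squares, or None if no split is possible."""
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate cutpoints: midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2
            left = [y for r, y in zip(rows, target) if r[j] <= cut]
            right = [y for r, y in zip(rows, target) if r[j] > cut]
            rss = sum_sq(left) + sum_sq(right)
            if best is None or rss < best[0]:
                best = (rss, j, cut)
    return None if best is None else (best[1], best[2])


def split_stability(rows, target, B=50, seed=1):
    """Refit the root split on B bootstrap samples and collect
    variable selection frequencies and the selected cutpoints."""
    rng = random.Random(seed)
    n = len(rows)
    selected = Counter()
    cutpoints = {}
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        split = best_stump([rows[i] for i in idx], [target[i] for i in idx])
        if split is None:
            continue
        j, cut = split
        selected[j] += 1
        cutpoints.setdefault(j, []).append(cut)
    freq = {j: count / B for j, count in selected.items()}
    return freq, cutpoints


def make_data(n=60, seed=0):
    """Simulated data (an assumption): variable 0 carries a step at 0.5,
    variable 1 is pure noise."""
    rng = random.Random(seed)
    rows = [(rng.random(), rng.random()) for _ in range(n)]
    target = [(1.0 if x0 > 0.5 else 0.0) + rng.gauss(0, 0.1) for x0, _ in rows]
    return rows, target


rows, target = make_data()
freq, cutpoints = split_stability(rows, target)
print("selection frequency per variable:", freq)
print("range of cutpoints for variable 0:",
      min(cutpoints[0]), max(cutpoints[0]))
```

A selection frequency close to 1 together with a narrow range of cutpoints suggests a stable split; a variable that is selected in only some resamples, or with widely scattered cutpoints, indicates the kind of instability the abstract refers to.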
Keywords: stability; recursive partitioning; variable selection; cutpoint selection; decision trees
JEL-codes: C14 C45 C52 C87
Pages: 19 pages
Date: 2016-05
Downloads:
https://www2.uibk.ac.at/downloads/c4041030/wpaper/2016-11.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inn:wpaper:2016-11
Bibliographic data for series maintained by Judith Courian.