An alternative pruning based approach to unbiased recursive partitioning

Alberto Alvarez-Iglesias, John Hinde, John Ferguson and John Newell

Computational Statistics & Data Analysis, 2017, vol. 106, issue C, 90-102

Abstract: Tree-based methods are a non-parametric modelling strategy that can be used in combination with generalized linear models or Cox proportional hazards models, mostly at an exploratory stage. Their popularity is mainly due to the simplicity of the technique along with the ease with which the resulting model can be interpreted. Variable selection bias from variables with many possible splits or missing values has been identified as one of the problems associated with tree-based methods. A number of unbiased recursive partitioning algorithms have been proposed that avoid this bias by using p-values in the splitting procedure of the algorithm. The final tree is obtained using direct stopping rules (pre-pruning strategy) or by growing a large tree first and pruning it afterwards (post-pruning). Some of the drawbacks of pre-pruned trees based on p-values in the presence of interaction effects and a large number of explanatory variables are discussed, and a simple alternative post-pruning solution is presented that allows the identification of such interactions. The proposed method includes a novel pruning algorithm that uses a false discovery rate (FDR) controlling procedure for the determination of splits corresponding to significant tests. The new approach is demonstrated with simulated and real-life examples.
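The abstract does not spell out the pruning algorithm, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a fully grown tree whose internal nodes each carry the p-value of their split test, applies the standard Benjamini-Hochberg step-up procedure to those p-values, and collapses any subtree that contains no significant split. The `Node` class, the function names, and the rule of retaining a non-significant split when a descendant split is significant are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node; p_value is the split test's p-value (None for a leaf)."""
    name: str
    p_value: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


def benjamini_hochberg(p_values: List[float], q: float = 0.05) -> List[bool]:
    """Flag each p-value as rejected (True) or not, using the BH step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Every hypothesis ranked at or below k_max is rejected.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


def internal_nodes(root: Node) -> List[Node]:
    """Collect every node that has children (i.e. every split)."""
    found, stack = [], [root]
    while stack:
        node = stack.pop()
        if node.children:
            found.append(node)
            stack.extend(node.children)
    return found


def prune(root: Node, q: float = 0.05) -> Node:
    """Collapse subtrees with no BH-significant split (assumed pruning rule)."""
    splits = internal_nodes(root)
    if not splits:
        return root
    flags = benjamini_hochberg([n.p_value for n in splits], q)
    significant = {id(n) for n, flag in zip(splits, flags) if flag}

    def keep(node: Node) -> bool:
        if not node.children:
            return False
        # Visit all children first so each one is pruned independently.
        below = [keep(child) for child in node.children]
        if id(node) in significant or any(below):
            return True
        node.children = []  # turn this node into a leaf
        return False

    keep(root)
    return root
```

In this sketch a weak split at the root survives when a split further down is significant, which is the kind of interaction structure that a p-value stopping rule applied at the root would miss.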

Keywords: Tree-based methods; Interactions; Pruning; False discovery rate
Date: 2017

Downloads:
http://www.sciencedirect.com/science/article/pii/S016794731630192X
Full text for ScienceDirect subscribers only.

Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:106:y:2017:i:c:p:90-102

DOI: 10.1016/j.csda.2016.08.011

Computational Statistics & Data Analysis is currently edited by S.P. Azen


 
Page updated 2025-03-19
Handle: RePEc:eee:csdana:v:106:y:2017:i:c:p:90-102