EconPapers    
Economics at your fingertips  
 

Unbiased variable importance for random forests

Markus Loecher

Communications in Statistics - Theory and Methods, 2022, vol. 51, issue 5, 1413-1425

Abstract: The default variable-importance measure in random forests, Gini importance, has been shown to suffer from the bias of the underlying Gini-gain splitting criterion. While the alternative permutation importance is generally accepted as a reliable measure of variable importance, it is also computationally demanding and suffers from other shortcomings. We propose a simple solution to the misleading/untrustworthy Gini importance which can be viewed as an over-fitting problem: we compute the loss reduction on the out-of-bag instead of the in-bag training samples.

Date: 2022
References: Add references at CitEc
Citations: View citations in EconPapers (5)

Downloads: (external link)
http://hdl.handle.net/10.1080/03610926.2020.1764042 (text/html)
Access to full text is restricted to subscribers.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:taf:lstaxx:v:51:y:2022:i:5:p:1413-1425

Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/lsta20

DOI: 10.1080/03610926.2020.1764042

Access Statistics for this article

Communications in Statistics - Theory and Methods is currently edited by Debbie Iscoe

More articles in Communications in Statistics - Theory and Methods from Taylor & Francis Journals
Bibliographic data for series maintained by Chris Longhurst ().

 
Page updated 2025-03-20
Handle: RePEc:taf:lstaxx:v:51:y:2022:i:5:p:1413-1425