EconPapers    
Economics at your fingertips  
 

Grouped variable importance with random forests and application to multiple functional data analysis

Baptiste Gregorutti, Bertrand Michel and Philippe Saint-Pierre

Computational Statistics & Data Analysis, 2015, vol. 90, issue C, 15-35

Abstract: The selection of grouped variables using the random forest algorithm is considered. First a new importance measure adapted for groups of variables is proposed. Theoretical insights into this criterion are given for additive regression models. Second, an original method for selecting functional variables based on the grouped variable importance measure is developed. Using a wavelet basis, it is proposed to regroup all of the wavelet coefficients for a given functional variable and use a wrapper selection algorithm with these groups. Various other groupings which take advantage of the frequency and time localization of the wavelet basis are proposed. An extensive simulation study is performed to illustrate the use of the grouped importance measure in this context. The method is applied to a real life problem coming from aviation safety.

Keywords: Random forests; Functional data analysis; Group permutation importance measure; Group variable selection (search for similar items in EconPapers)
Date: 2015
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (12)

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947315000997
Full text for ScienceDirect subscribers only.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:90:y:2015:i:c:p:15-35

DOI: 10.1016/j.csda.2015.04.002

Access Statistics for this article

Computational Statistics & Data Analysis is currently edited by S.P. Azen

More articles in Computational Statistics & Data Analysis from Elsevier
Bibliographic data for series maintained by Catherine Liu ().

 
Page updated 2025-03-19
Handle: RePEc:eee:csdana:v:90:y:2015:i:c:p:15-35