Stable variable selection for right censored data: comparison of methods

Philippe Besse, Eve Leconte and Marie Walschaerts

No 12-486, TSE Working Papers from Toulouse School of Economics (TSE)

Abstract: Instability in model selection is a major concern for data sets containing a large number of covariates. This paper deals with variable selection methodology in the case of high-dimensional problems where the response variable can be right censored. We focus on new stable variable selection methods based on the bootstrap for two methodologies commonly used in survival analysis: the Cox proportional hazards model and survival trees. For the Cox model, we investigate bootstrapping applied to two variable selection techniques: the stepwise algorithm based on the AIC criterion and the L1-penalization of the Lasso. Regarding survival trees, we review two methodologies: bootstrap node-level stabilization and random survival forests. We apply these approaches to two real data sets, a classical breast cancer data set and an original infertility data set. We compare the methods on two criteria: the prediction error rate based on the Harrell concordance index and the relevance of the interpretation of the corresponding selected models, focusing on the original infertility data set. The aim is to find a compromise between good prediction performance and ease of interpretation for clinicians. Results suggest that, when the number of individuals is small, bootstrapping adapted to L1-penalization in the Cox model or bootstrap node-level stabilization in survival trees offers a good alternative to the random survival forest methodology, which is known to give the smallest prediction error rate but is difficult for non-statisticians to interpret. From a clinical perspective, the complementarity between the methods based on the Cox model and those based on survival trees would make it possible to build reliable models that are easy for the clinician to interpret.
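The abstract's first comparison criterion, a prediction error rate based on the Harrell concordance index, can be illustrated with a minimal sketch. The function name `harrell_c_index` and the toy data are assumptions made for illustration and are not taken from the paper; the sketch skips pairs with tied observed times, whereas library implementations handle ties more carefully.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (simplified).

    times:       observed follow-up times
    events:      1 if the event was observed, 0 if right censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed
        # event; pairs with tied times are skipped in this sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk failed first: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented for illustration only).
times = [5, 8, 12, 3, 9]
events = [1, 0, 1, 1, 0]
risk = [0.9, 0.3, 0.2, 0.8, 0.6]

c_index = harrell_c_index(times, events, risk)
error_rate = 1.0 - c_index  # prediction error rate derived from the C-index
print(f"C-index = {c_index:.3f}, error rate = {error_rate:.3f}")
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the error rate 1 - C falls as prediction improves.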

Keywords: censored data; variable selection; survival trees; survival random forests; Lasso; Cox model; bootstrap
Date: 2012-03
Citations: View citations in EconPapers (2)

Downloads: (external link)
http://arxiv.org/pdf/1203.4928v1.pdf Full text (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:tse:wpaper:28126

 
Page updated 2025-04-19
Handle: RePEc:tse:wpaper:28126