Recursive partitioning on incomplete data using surrogate decisions and multiple imputation

A. Hapfelmeier, T. Hothorn and K. Ulm

Computational Statistics & Data Analysis, 2012, vol. 56, issue 6, 1552-1565

Abstract: The occurrence of missing data is a major problem in statistical data analysis. All scientific fields and data of all kinds and sizes are affected by this problem. There are a number of ad hoc solutions, which unfortunately lead to a loss of power, biased inference, underestimation of variability and distorted relationships between variables. A more promising approach of rising popularity is multiple imputation by chained equations (MICE), also known as imputation by full conditional specification (FCS). Alternatives to imputation are given by methods with built-in procedures for handling missing values. These include recursive partitioning by classification and regression trees as well as corresponding Random Forests. However, there is little literature comparing the two approaches. Existing evaluations often lack generalizability due to restrictions on data structure and simulation schemes. The application of both methods to several kinds of data and different simulation settings is meant to improve and extend the comparative analyses. Classification and regression studies are examined. Recursive partitioning is executed by two popular tree implementations and one Random Forest implementation. Findings show that multiple imputation produces ambiguous performance results for both simulated and real-life data. Using surrogates instead is a fast and simple way to achieve performance that is only negligibly worse and in many cases even superior.

Keywords: Recursive partitioning; Classification and regression trees; Random Forests; Multiple imputation; MICE; Surrogates
Date: 2012
Citations: 9 (in EconPapers)

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947311003550
Full text for ScienceDirect subscribers only.

Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:56:y:2012:i:6:p:1552-1565

DOI: 10.1016/j.csda.2011.09.024

Computational Statistics & Data Analysis is currently edited by S.P. Azen

More articles in Computational Statistics & Data Analysis from Elsevier
Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-03-19
Handle: RePEc:eee:csdana:v:56:y:2012:i:6:p:1552-1565