A closer examination of three small-sample approximations to the multiple-imputation degrees of freedom

David A. Wagstaff and Ofer Harel
Additional contact information
David A. Wagstaff: Pennsylvania State University
Ofer Harel: University of Connecticut

Stata Journal, 2011, vol. 11, issue 3, 403-419

Abstract: Incomplete data is a common complication in applied research. In this study, we use simulation to compare two approaches to the multiple imputation of a continuous predictor: multiple imputation through chained equations and multivariate normal imputation. This study extends earlier work by being the first to 1) compare the small-sample approximations to the multiple-imputation degrees of freedom proposed by Barnard and Rubin (1999, Biometrika 86: 948–955); Lipsitz, Parzen, and Zhao (2002, Journal of Statistical Computation and Simulation 72: 309–318); and Reiter (2007, Biometrika 94: 502–508) and 2) ask if the sampling distribution of the t statistics is in fact a Student’s t distribution with the specified degrees of freedom. In addition to varying the imputation method, we varied the number of imputations (m = 5, 10, 20, 100) that were averaged over 500,000 replications to obtain the combined estimates and standard errors for a linear model that regressed the log price of a home on its age (years) and size (square feet) in a sample of 25 observations. Six age values were randomly set equal to missing for each replication. As assessed by the absolute percentage and relative percentage bias, the two approaches performed similarly. The absolute bias of the regression coefficients for age and size was roughly −0.1% across the levels of m for both approaches; the absolute bias for the constant was 0.6% for the chained-equations approach and 1.0% for the multivariate normal model. The absolute biases of the standard errors for age, size, and the constant were 0.2%, 0.3%, and 1.2%, respectively. In general, the relative percentage bias was slightly smaller for the chained-equations approach.
Graphical and numerical inspection of the empirical sampling distributions for the three t statistics suggested that the area from the shoulder to the tail was reasonably well approximated by a t distribution and that the small-sample approximations to the multiple-imputation degrees of freedom proposed by Barnard and Rubin and by Reiter performed satisfactorily. Copyright 2011 by StataCorp LP.
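
For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of how m completed-data estimates are commonly pooled and how the Barnard and Rubin (1999) small-sample degrees of freedom are computed from them. It is not the authors' code. The function names and the input values are hypothetical and chosen only for illustration, and the Lipsitz, Parzen, and Zhao and the Reiter approximations are not shown. The complete-data degrees of freedom of 22 assume 25 observations and three estimated coefficients, as in the model the abstract describes.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Pool m completed-data estimates and their squared standard errors."""
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b    # total variance of the pooled estimate
    return q_bar, u_bar, b, t


def barnard_rubin_df(m, u_bar, b, df_complete):
    """Barnard and Rubin (1999) small-sample degrees of freedom."""
    t = u_bar + (1 + 1 / m) * b
    lam = (1 + 1 / m) * b / t      # share of total variance due to missing data
    # Observed-data degrees of freedom, shrunk from the complete-data value.
    df_obs = (df_complete + 1) / (df_complete + 3) * df_complete * (1 - lam)
    if lam == 0:
        return df_obs              # large-sample df is infinite in this case
    df_large = (m - 1) / lam ** 2  # Rubin's large-sample degrees of freedom
    return 1 / (1 / df_large + 1 / df_obs)


# Hypothetical inputs: five imputations of one coefficient (illustrative only).
estimates = [0.10, 0.12, 0.09, 0.11, 0.13]
variances = [0.004, 0.004, 0.004, 0.004, 0.004]

q_bar, u_bar, b, t = pool_estimates(estimates, variances)
df_adj = barnard_rubin_df(len(estimates), u_bar, b, df_complete=22)
print(f"pooled estimate = {q_bar:.3f}, pooled SE = {t ** 0.5:.4f}, df = {df_adj:.2f}")
```

By construction, the adjusted degrees of freedom never exceed the complete-data degrees of freedom, which is the property that matters in a sample as small as the one described above.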

Keywords: missing data; multiple imputation; small-sample degrees of freedom
Date: 2011
Note: To access the software from within Stata, type net describe http://www.stata-journal.com/software/sj11-3/st0235/
Citations: View citations in EconPapers (2)

Downloads:
http://www.stata-journal.com/article.html?article=st0235 (link to article purchase)

Persistent link: https://EconPapers.repec.org/RePEc:tsj:stataj:v:11:y:2011:i:3:p:403-419

Ordering information: This journal article can be ordered from
http://www.stata-journal.com/subscription.html

Stata Journal is currently edited by Nicholas J. Cox and Stephen P. Jenkins

More articles in Stata Journal from StataCorp LLC
Bibliographic data for series maintained by Christopher F. Baum and Lisa Gilmore.

 
Page updated 2025-03-20
Handle: RePEc:tsj:stataj:v:11:y:2011:i:3:p:403-419