Proxy Pattern-Mixture Analysis for a Binary Variable Subject to Nonresponse
Andridge Rebecca R. and
Little Roderick J.A.
Additional contact information
Andridge Rebecca R.: The Ohio State University College of Public Health Division of Biostatistics, 242 Cunz Hall, 1841 Neil Ave., Columbus, OH 43210, U.S.A.
Little Roderick J.A.: University of Michigan, Department of Biostatistics, M4071 SPH II, 1415 Washington Heights, Ann Arbor, MI 48109, U.S.A.
Journal of Official Statistics, 2020, vol. 36, issue 3, 703-728
Abstract:
Given increasing survey nonresponse, good measures of the potential impact of nonresponse on survey estimates are particularly important. Existing measures, such as the R-indicator, make the strong assumption that data are missing at random, meaning that missingness depends only on variables that are observed for both respondents and nonrespondents. We consider how to assess the impact of nonresponse for a binary survey variable Y when data may be missing not at random, meaning that missingness may depend on Y itself. Our work is motivated by missing categorical income data in the 2015 Ohio Medicaid Assessment Survey (OMAS), where whether income is missing may be related to the income value itself, with low-income earners more reluctant to respond. We assume there is a set of covariates observed for both nonrespondents and respondents; for item nonresponse (as in OMAS) this is often a rich set of variables, but for unit nonresponse it may be limited. To reduce dimensionality and simplify the analysis, we reduce these covariates to a single continuous proxy variable X, available for both respondents and nonrespondents, that has the highest correlation with Y, estimated from a probit regression analysis of respondent data. We extend the previously proposed proxy pattern-mixture (PPM) analysis for continuous outcomes to the binary outcome, using a latent variable approach to model the joint distribution of Y and X. Our method does not assume data are missing at random but includes this as a special case, thus creating a convenient framework for sensitivity analyses. Maximum likelihood, Bayesian, and multiple imputation versions of PPM analysis are described, and the robustness of these methods to model assumptions is discussed. Properties are demonstrated through simulation and with the 2015 OMAS data.
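The proxy-construction step described in the abstract (a probit regression fitted to respondents, with the fitted linear predictor then computed for everyone) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are simulated, there is a single hypothetical covariate z, missingness depends on z only, and the probit model is fitted with a hand-written Fisher scoring loop. All names (fit_probit, z, y, responded, proxy) are invented for the example.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the probit link


def fit_probit(z, y, iterations=25):
    """Fit P(Y=1 | z) = Phi(b0 + b1*z) by Fisher scoring (one covariate)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # score vector
        i00 = i01 = i11 = 0.0    # expected information matrix
        for zi, yi in zip(z, y):
            eta = b0 + b1 * zi
            p = min(max(STD_NORMAL.cdf(eta), 1e-10), 1 - 1e-10)
            d = STD_NORMAL.pdf(eta)
            score = d * (yi - p) / (p * (1 - p))
            w = d * d / (p * (1 - p))
            g0 += score
            g1 += score * zi
            i00 += w
            i01 += w * zi
            i11 += w * zi * zi
        det = i00 * i11 - i01 * i01
        # Solve the 2x2 system (information) * step = score
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    return b0, b1


# Simulated data (assumption): latent U = 0.5 + 1.0*z + noise, Y = 1 if U > 0.
random.seed(2020)
n = 3000
z = [random.gauss(0, 1) for _ in range(n)]
y = [1 if 0.5 + 1.0 * zi + random.gauss(0, 1) > 0 else 0 for zi in z]

# Response indicator (assumption): depends on z only, not on Y.
responded = [random.random() < STD_NORMAL.cdf(0.5 + 0.5 * zi) for zi in z]

# Fit the probit model on respondents only.
z_resp = [zi for zi, r in zip(z, responded) if r]
y_resp = [yi for yi, r in zip(y, responded) if r]
b0, b1 = fit_probit(z_resp, y_resp)

# The proxy X is the fitted linear predictor, available for every unit.
proxy = [b0 + b1 * zi for zi in z]

resp_proxy = [x for x, r in zip(proxy, responded) if r]
nonresp_proxy = [x for x, r in zip(proxy, responded) if not r]
print("estimated coefficients:", round(b0, 3), round(b1, 3))
print("mean proxy, respondents:", sum(resp_proxy) / len(resp_proxy))
print("mean proxy, nonrespondents:", sum(nonresp_proxy) / len(nonresp_proxy))
```

With several covariates the same idea applies, with the proxy being the fitted linear combination of all of them; a statistics library's probit routine would normally replace the hand-written fitting loop.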
Keywords: Missing data; nonignorable nonresponse; nonresponse bias; survey data; Bayesian methods
Date: 2020
Download: https://doi.org/10.2478/jos-2020-0035
Persistent link: https://EconPapers.repec.org/RePEc:vrs:offsta:v:36:y:2020:i:3:p:703-728:n:12
DOI: 10.2478/jos-2020-0035
Journal of Official Statistics is currently edited by Annica Isaksson and Ingegerd Jansson
More articles in Journal of Official Statistics from Sciendo
Bibliographic data for series maintained by Peter Golla.