Combining Probability and Nonprobability Samples by Using Multivariate Mass Imputation Approaches with Application to Biomedical Research
Sixia Chen,
Alexandra May Woodruff,
Janis Campbell,
Sara Vesely,
Zheng Xu and
Cuyler Snider
Additional contact information
Sixia Chen: Department of Biostatistics and Epidemiology, University of Oklahoma Health Sciences Center, 801 NE 13th St., Oklahoma City, OK 73104, USA
Alexandra May Woodruff: Department of Biostatistics and Epidemiology, University of Oklahoma Health Sciences Center, 801 NE 13th St., Oklahoma City, OK 73104, USA
Janis Campbell: Department of Biostatistics and Epidemiology, University of Oklahoma Health Sciences Center, 801 NE 13th St., Oklahoma City, OK 73104, USA
Sara Vesely: Department of Biostatistics and Epidemiology, University of Oklahoma Health Sciences Center, 801 NE 13th St., Oklahoma City, OK 73104, USA
Zheng Xu: Department of Mathematics and Statistics, Wright State University, Dayton, OH 45324, USA
Cuyler Snider: Southern Plains Tribal Health Board, 9705 Broadway Ext, Oklahoma City, OK 73114, USA
Stats, 2023, vol. 6, issue 2, 1-9
Abstract:
Nonprobability samples are used frequently in practice, including in public health studies, economics, education, and political polling. Naïve estimates based on nonprobability samples without any further adjustment may suffer from serious selection bias. Mass imputation has been shown to be effective in practice at improving the representativeness of nonprobability samples. It builds an imputation model based on the nonprobability sample and generates imputed values for all units in the probability sample. In this paper, we compare two mass imputation approaches for integrating multiple outcome variables simultaneously: latent joint multivariate normal model mass imputation (e.g., Generalized Efficient Regression-Based Imputation with Latent Processes (GERBIL)) and fully conditional specification (FCS) procedures. The Monte Carlo simulation study shows the benefits of GERBIL and of FCS with predictive mean matching in terms of balancing the Monte Carlo bias and variance. We further evaluate our proposed method by combining the information from the Tribal Behavioral Risk Factor Surveillance System and the Behavioral Risk Factor Surveillance System data files.
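The mass imputation idea described in the abstract (fit an imputation model on the nonprobability sample, then impute the outcome for every unit in the probability sample) can be illustrated with a minimal sketch. This is not the paper's GERBIL or FCS procedure: it swaps in a single-outcome ordinary least squares model, and the function name `mass_imputation_mean`, the simulated population, and the selection mechanism are all assumptions made purely for illustration.

```python
import numpy as np

def mass_imputation_mean(x_np, y_np, x_p, w_p):
    """Illustrative (hypothetical) helper: fit y ~ x on the nonprobability
    sample, impute y for every probability-sample unit, and return the
    design-weighted mean of the imputed values."""
    # Design matrices with an intercept column.
    A_np = np.column_stack([np.ones(len(x_np)), x_np])
    A_p = np.column_stack([np.ones(len(x_p)), x_p])
    # Ordinary least squares fit on the nonprobability sample.
    beta, *_ = np.linalg.lstsq(A_np, y_np, rcond=None)
    # Imputed outcomes for all probability-sample units.
    y_imputed = A_p @ beta
    # Weighted mean using the probability-sample survey weights.
    return float(np.sum(w_p * y_imputed) / np.sum(w_p))

# Simulated finite population (an assumption, not data from the paper).
rng = np.random.default_rng(0)
N = 20000
x = rng.normal(size=N)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=N)
true_mean = y.mean()

# Nonprobability sample: units with larger x are more likely to be included,
# which biases the naive sample mean of y.
p_sel = 1.0 / (1.0 + np.exp(-(x - 1.0)))
in_np = rng.random(N) < p_sel

# Probability sample: simple random sample with equal weights.
# Only the covariate x is used from it; y is treated as unobserved.
n_p = 1000
idx_p = rng.choice(N, size=n_p, replace=False)
w_p = np.full(n_p, N / n_p)

naive = y[in_np].mean()
mi = mass_imputation_mean(x[in_np], y[in_np], x[idx_p], w_p)
print(f"true mean {true_mean:.3f}, naive {naive:.3f}, mass imputation {mi:.3f}")
```

In this setup the selection depends only on the covariate, so the regression fitted on the nonprobability sample transfers to the probability sample; when selection also depends on the outcome itself, a sketch like this would not remove the bias.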
Keywords: nonprobability sample; multivariate imputation; public health data; selection bias
JEL-codes: C1 C10 C11 C14 C15 C16
Date: 2023
Downloads:
https://www.mdpi.com/2571-905X/6/2/39/pdf (application/pdf)
https://www.mdpi.com/2571-905X/6/2/39/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jstats:v:6:y:2023:i:2:p:39-625:d:1141547
Stats is currently edited by Mrs. Minnie Li
Bibliographic data for series maintained by MDPI Indexing Manager.