EconPapers    
Economics at your fingertips  
 

Data Integration by Combining Big Data and Survey Sample Data for Finite Population Inference

Jae‐Kwang Kim and Siu‐Ming Tam

International Statistical Review, 2021, vol. 89, issue 2, 382-401

Abstract: The statistical challenges in using big data for making valid statistical inference in the finite population have been well documented in literature. These challenges are due primarily to statistical bias arising from under‐coverage in the big data source to represent the population of interest and measurement errors in the variables available in the data set. By stratifying the population into a big data stratum and a missing data stratum, we can estimate the missing data stratum by using a fully responding probability sample and hence the population as a whole by using a data integration estimator. By expressing the data integration estimator as a regression estimator, we can handle measurement errors in the variables in big data and also in the probability sample. We also propose a fully nonparametric classification method for identifying the overlapping units and develop a bias‐corrected data integration estimator under misclassification errors. Finally, we develop a two‐step regression data integration estimator to deal with measurement errors in the probability sample. An advantage of the approach advocated in this paper is that we do not have to make unrealistic missing‐at‐random assumptions for the methods to work. The proposed method is applied to the real data example using 2015–2016 Australian Agricultural Census data.

Date: 2021
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (3)

Downloads: (external link)
https://doi.org/10.1111/insr.12434

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:bla:istatr:v:89:y:2021:i:2:p:382-401

Ordering information: This journal article can be ordered from
http://www.blackwell ... bs.asp?ref=0306-7734

Access Statistics for this article

International Statistical Review is currently edited by Eugene Seneta and Kees Zeelenberg

More articles in International Statistical Review from International Statistical Institute Contact information at EDIRC.
Bibliographic data for series maintained by Wiley Content Delivery ().

 
Page updated 2025-03-19
Handle: RePEc:bla:istatr:v:89:y:2021:i:2:p:382-401