Superpopulation model inference for non probability samples under informative sampling with high-dimensional data
Zhan Liu,
Dianni Wang and
Yingli Pan
Communications in Statistics - Theory and Methods, 2025, vol. 54, issue 5, 1370-1390
Abstract:
Non probability samples have been widely used in various fields. However, non probability samples suffer from selection bias because the selection probabilities are unknown. Superpopulation model inference methods have been proposed to address this problem, but these approaches require the non informative sampling assumption. When the sampling mechanism is informative, that is, when selection probabilities are related to the outcome variable, the previous inference methods may be invalid. Moreover, a large number of covariates may be encountered in practice, which poses a new challenge for inference from non probability samples under informative sampling. In this article, superpopulation model approaches under informative sampling with high-dimensional data are developed to perform valid inferences from non probability samples. Specifically, a semiparametric exponential tilting model is established to estimate selection probabilities, and the sample distribution is derived for estimating the superpopulation model parameters. Moreover, SCAD, adaptive LASSO, and Model-X knockoffs are employed to select variables and estimate parameters in superpopulation modeling. Asymptotic properties of the proposed estimators are established. Results from simulation studies are presented to compare the performance of the proposed estimators with the naive estimator, which ignores informative sampling. The proposed methods are further applied to the National Health and Nutrition Examination Survey data.
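The following is a minimal illustrative sketch of why a naive estimator can fail under informative sampling. It is not the authors' semiparametric estimator. It assumes a fully parametric toy setting: a normal linear superpopulation model y = beta0 + beta1*x + e with error variance sigma^2, a selection probability proportional to exp(gamma*y), and a known tilting parameter gamma. Under these assumptions the sample distribution of y given x is again normal with the same slope and variance, but with the intercept shifted by gamma*sigma^2, so the naive fit is biased and can be corrected by subtracting that shift. All function names and parameter values are hypothetical.

```python
import math
import random


def simulate_informative_sample(n_pop=300_000, beta0=1.0, beta1=2.0,
                                sigma=1.0, gamma=0.5, seed=0):
    """Draw a finite population and select units with probability
    proportional to exp(gamma * y) (an assumed tilting form)."""
    rng = random.Random(seed)
    population = []
    for _ in range(n_pop):
        x = rng.gauss(0.0, 1.0)
        y = beta0 + beta1 * x + rng.gauss(0.0, sigma)
        population.append((x, y))
    # Scale on the log scale so that the largest selection probability is 1.
    log_w_max = max(gamma * y for _, y in population)
    return [(x, y) for x, y in population
            if rng.random() < math.exp(gamma * y - log_w_max)]


def fit_line(sample):
    """Ordinary least squares for a single covariate.
    Returns (intercept, slope, residual variance)."""
    n = len(sample)
    mean_x = sum(x for x, _ in sample) / n
    mean_y = sum(y for _, y in sample) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in sample)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in sample)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in sample)
    return intercept, slope, rss / (n - 2)


def corrected_intercept(sample, gamma):
    """Remove the gamma * sigma^2 shift implied by the assumed tilting,
    treating gamma as known (it would have to be estimated in practice)."""
    intercept, _, sigma2_hat = fit_line(sample)
    return intercept - gamma * sigma2_hat


if __name__ == "__main__":
    sample = simulate_informative_sample()
    naive_b0, naive_b1, _ = fit_line(sample)
    print("naive intercept:", round(naive_b0, 3))
    print("naive slope:", round(naive_b1, 3))
    print("corrected intercept:", round(corrected_intercept(sample, 0.5), 3))
```

In this toy setting only the intercept is distorted. With a selection mechanism that also depends on covariates, or with an unknown gamma, the correction is no longer this simple, which is the situation the abstract's semiparametric approach is described as addressing.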
Date: 2025
Downloads: (external link)
http://hdl.handle.net/10.1080/03610926.2024.2335543 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:taf:lstaxx:v:54:y:2025:i:5:p:1370-1390
DOI: 10.1080/03610926.2024.2335543
Communications in Statistics - Theory and Methods is currently edited by Debbie Iscoe