Multiple imputation for recovering missing values when data cannot be shared
Robert Thiesmeier
Additional contact information
Robert Thiesmeier: Karolinska Institutet
Biostatistics and Epidemiology Virtual Symposium 2025 from Stata Users Group
Abstract:
Multisite studies are increasingly used to study human health across different populations and countries. However, a common challenge in using data from multiple studies is the presence of systematically missing values, which arise when some studies have not recorded information on certain variables. Although data from sites with recorded observations can be used to impute the missing values, this becomes difficult when the data cannot be pooled because of logistical or legal constraints. We address this by introducing a framework for multiple imputation across study sites that does not require sharing individual-level data. In this talk, we present some motivating examples alongside a new command, mi impute from, that can handle the imputation of binary, discrete, and continuous variables. Given the increasing importance of multisite studies in medical and epidemiological research, mi impute from can offer a practical approach for imputing variables that have not been recorded in some study sites.
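The general idea of imputing at one site from another site's summary statistics can be illustrated with a minimal sketch. This is not the author's implementation and does not reproduce the syntax or internals of mi impute from, which the abstract does not describe. It assumes a single continuous variable x that is unrecorded at one site, a single fully observed predictor z, and a normal linear imputation model. All function and variable names are hypothetical, and the data are simulated purely for illustration. Only the regression summary (coefficients, the inverse cross-product matrix, residual variance, and degrees of freedom) crosses between sites.

```python
import math
import random


def fit_site_summary(z, x):
    """Site with x recorded: regress x on z and return summary statistics only."""
    n = len(z)
    zbar = sum(z) / n
    xbar = sum(x) / n
    szz = sum((zi - zbar) ** 2 for zi in z)
    szx = sum((zi - zbar) * (xi - xbar) for zi, xi in zip(z, x))
    b1 = szx / szz
    b0 = xbar - b1 * zbar
    rss = sum((xi - b0 - b1 * zi) ** 2 for zi, xi in zip(z, x))
    # (X'X)^{-1} for a design matrix with an intercept and one slope
    xtx_inv = [[1 / n + zbar ** 2 / szz, -zbar / szz],
               [-zbar / szz, 1 / szz]]
    return {"beta": [b0, b1], "xtx_inv": xtx_inv,
            "sigma2": rss / (n - 2), "df": n - 2}


def impute_from_summary(summary, z_new, m, rng):
    """Site with x unrecorded: build m imputed copies of x from the shared summary."""
    b0, b1 = summary["beta"]
    (a, b), (_, d) = summary["xtx_inv"]
    df = summary["df"]
    # Cholesky factor of the 2x2 matrix (X'X)^{-1}
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 ** 2)
    imputations = []
    for _ in range(m):
        # Draw the residual variance: sigma2 * df / chi-square(df)
        chi2 = rng.gammavariate(df / 2, 2.0)
        s = math.sqrt(summary["sigma2"] * df / chi2)
        # Draw coefficients from N(beta, s^2 * (X'X)^{-1})
        u1, u2 = rng.gauss(0, 1), rng.gauss(0, 1)
        b0_star = b0 + s * l11 * u1
        b1_star = b1 + s * (l21 * u1 + l22 * u2)
        # Impute x with fresh residual noise for every observation
        imputations.append([b0_star + b1_star * zi + rng.gauss(0, s)
                            for zi in z_new])
    return imputations


# Simulated illustration: site A recorded x, site B did not.
rng = random.Random(2025)
z_a = [rng.gauss(0, 1) for _ in range(500)]
x_a = [1.0 + 2.0 * zi + rng.gauss(0, 1) for zi in z_a]
z_b = [rng.gauss(0, 1) for _ in range(200)]

summary = fit_site_summary(z_a, x_a)          # the only object that leaves site A
imputed = impute_from_summary(summary, z_b, m=5, rng=rng)
```

Drawing the coefficients and residual variance afresh for each imputed copy, rather than plugging in the point estimates, is what lets the between-imputation variability reflect uncertainty in the imputation model. Binary or discrete variables would need a different imputation model than the linear one assumed here.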
Date: 2025-03-05
Downloads: (external link)
http://repec.org/biep2025/Bio25_Thiesmeier.pdf presentation materials (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:boc:biep25:02