Imputation when data cannot be pooled
Nicola Orsini, Robert Thiesmeier and Matteo Bottai
Additional contact information
Robert Thiesmeier: Karolinska Institutet
Matteo Bottai: Karolinska Institutet
UK Stata Conference 2024 from Stata Users Group
Abstract:
Distributed data networks are increasingly used to study human health across populations and countries. Analyses are commonly performed separately at each study site because legal and logistical barriers prevent the transfer of individual-level data between sites. A frequent challenge in such networks is that key variables of interest are absent at one or more study sites. Current imputation methods require individual-level data from the studies involved, which creates a need for methods that can impute data in one study using only information that can be shared easily and freely within the network. To address this need, we introduce a new Stata command, mi impute from, which imputes missing variables in a single study using a linear predictor and the associated variance/covariance matrix from an imputation model fitted in one or more external studies. This presentation describes the syntax of mi impute from and illustrates it with motivating examples from health-related research.
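The general idea in the abstract can be illustrated with a minimal sketch. It is not the implementation or the syntax of mi impute from, which this record does not give. It assumes that the missing variable is binary, that the external study fitted a logistic imputation model, and that only the coefficient vector and its variance/covariance matrix were shared. The function names (cholesky, draw_coefficients, impute_binary) and all numbers are hypothetical.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L L^T = a (a must be symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def draw_coefficients(beta, vcov, rng):
    """One draw from N(beta, vcov), so each imputation reflects the
    estimation uncertainty of the external model."""
    low = cholesky(vcov)
    z = [rng.gauss(0.0, 1.0) for _ in beta]
    return [b + sum(low[i][k] * z[k] for k in range(i + 1))
            for i, b in enumerate(beta)]


def impute_binary(rows, beta, vcov, m=5, seed=1):
    """Return m imputed copies of a binary variable that is missing locally.

    rows: locally observed covariate vectors, each with a leading 1 for
    the intercept. Assumption: the shared model is a logistic regression.
    """
    rng = random.Random(seed)
    imputations = []
    for _ in range(m):
        b = draw_coefficients(beta, vcov, rng)
        imputed = []
        for x in rows:
            xb = sum(bj * xj for bj, xj in zip(b, x))  # linear predictor
            p = 1.0 / (1.0 + math.exp(-xb))            # predicted probability
            imputed.append(1 if rng.random() < p else 0)
        imputations.append(imputed)
    return imputations


# Hypothetical shared estimates (intercept, one covariate) and local data.
beta = [-1.0, 0.5]
vcov = [[0.04, -0.01], [-0.01, 0.02]]
rows = [[1.0, 0.2], [1.0, 1.5], [1.0, -0.7]]
imputations = impute_binary(rows, beta, vcov, m=5, seed=1)
```

Only beta and vcov cross the site boundary in this sketch; the individual-level records in rows never leave the local study.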
Date: 2024-09-16
Downloads: (external link)
http://repec.org/lsug2024/UK24_Orsini.pdf
Persistent link: https://EconPapers.repec.org/RePEc:boc:lsug24:09