Simulation of "forward-backward" multiple-imputation technique in longitudinal clinical dataset
Catherine Welch, Irene Petersen, and James Carpenter
Additional contact information
Catherine Welch: Department of Primary Care & Population Health, University College London
Irene Petersen: Department of Primary Care & Population Health, University College London
James Carpenter: Medical Statistics Unit, London School of Hygiene and Tropical Medicine
United Kingdom Stata Users' Group Meetings 2010 from Stata Users Group
Abstract:
Most standard missing-data techniques have been designed for cross-sectional data. A "forward-backward" multiple-imputation algorithm has been developed to impute missing values in longitudinal data (Nevalainen, Kenward, and Virtanen, 2009, Statistics in Medicine 28: 3657-3669). This technique will be applied to The Health Improvement Network (THIN), a longitudinal primary-care database, to impute variables associated with incidence of cardiovascular disease (CVD). A sample of 483 patients was extracted from THIN to test the performance of the algorithm before it was applied to the whole dataset. This dataset included individuals with information available on age, sex, deprivation quintile, height, weight, systolic blood pressure, and total serum cholesterol for each age from 65 to 69 years. CVD was identified if the patient was diagnosed with one of a predefined list of conditions at any of these ages. The patient was then considered to have CVD at each subsequent age. In this sample, measurements of weight, systolic blood pressure, and cholesterol were replaced with missing values such that the probability that data are missing decreases as age increases; i.e., the data are missing at random and the overall percentage of missing data is equivalent to that in THIN. We then applied the forward-backward algorithm, which imputes values at each time point by using measurements before and after the one of interest and updates values sequentially. Ten complete datasets were created. A Poisson regression was performed using data in each dataset, and estimates were combined using Rubin's rules. These steps were repeated 200 times and the coefficients were averaged. I will explain in more detail how the forward-backward algorithm works and will also demonstrate the results following multiple imputation using this algorithm.
I will compare these results with the analysis before data were replaced with missing values and with a complete-case analysis to assess the performance of the algorithm.
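The pooling step mentioned in the abstract can be illustrated with a minimal sketch of Rubin's rules for a single regression coefficient. This is not the authors' code (the talk was given to a Stata audience); the function name pool_rubin and the example numbers are hypothetical, and the sketch assumes each imputed dataset has already yielded one coefficient estimate and its squared standard error.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: the coefficient estimated in each imputed dataset
    variances: the squared standard error of that coefficient in each dataset
    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)       # pooled point estimate
    w = mean(variances)           # within-imputation variance
    b = variance(estimates)       # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b       # total variance
    return q_bar, t, sqrt(t)


# Hypothetical values for illustration only, not results from the study.
example_estimates = [0.21, 0.19, 0.23, 0.20, 0.22]
example_variances = [0.004, 0.005, 0.004, 0.006, 0.005]
pooled, total_var, pooled_se = pool_rubin(example_estimates, example_variances)
print(pooled, pooled_se)
```

In the design described above, a function of this kind would be applied to the ten imputed datasets within each of the 200 repetitions, and the pooled coefficients would then be averaged across repetitions.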
Date: 2010-09-17
Downloads: (external link)
http://repec.org/usug2010/UKSUG10.Welch.ppt (application/x-mspowerpoint)
Persistent link: https://EconPapers.repec.org/RePEc:boc:usug10:04