
Exploiting collider bias to apply two-sample summary data Mendelian randomization methods to one-sample individual level data

Ciarrah Barry, Junxi Liu, Rebecca Richmond, Martin K Rutter, Deborah A Lawlor, Frank Dudbridge and Jack Bowden

PLOS Genetics, 2021, vol. 17, issue 8, 1-26

Abstract: Over the last decade the availability of SNP-trait associations from genome-wide association studies has led to an array of methods for performing Mendelian randomization studies using only summary statistics. A common feature of these methods, besides their intuitive simplicity, is the ability to combine data from several sources, incorporate multiple variants and account for biases due to weak instruments and pleiotropy. With the advent of large and accessible fully-genotyped cohorts such as UK Biobank, there is now increasing interest in understanding how best to apply these well-developed summary data methods to individual level data, and to explore the use of more sophisticated causal methods allowing for non-linearity and effect modification. In this paper we describe a general procedure for optimally applying any two-sample summary data method using one-sample data. Our procedure first performs a meta-analysis of summary data estimates that are intentionally contaminated by collider bias between the genetic instruments and unmeasured confounders, due to conditioning on the observed exposure. These estimates are then used to correct the standard observational association between an exposure and outcome. Simulations are conducted to demonstrate the method’s performance against naive applications of two-sample summary data MR. We apply the approach to the UK Biobank cohort to investigate the causal role of sleep disturbance on HbA1c levels, an important determinant of diabetes. Our approach can be viewed as a generalization of Dudbridge et al. (Nat. Comm. 10: 1561), who developed a technique to adjust for index event bias when uncovering genetic predictors of disease progression based on case-only data. Our work serves to clarify that in any one-sample MR analysis, it can be advantageous to estimate causal relationships by artificially inducing and then correcting for collider bias.

Author summary: Uncovering causal mechanisms between risk factors and disease is challenging with observational data because of unobserved confounding. Mendelian randomization offers a potential solution by replacing an individual’s observed risk factor data with an unconfounded genetic proxy measure. Over the last decade an array of methods has been developed for performing Mendelian randomization (MR) studies using publicly available summary statistics gleaned from two separate genome-wide association studies. With the advent of large and accessible fully-genotyped cohorts such as UK Biobank, there is now increasing interest in understanding how best to apply these well-developed summary data methods to individual level data. In this paper we describe a general procedure for optimally applying any summary data MR method using individual level data from one cohort study. Our approach may at first seem nonsensical: we create summary statistics that are intentionally biased by confounding. This bias can, however, be very accurately estimated, and the estimate then used to correct the results of a standard observational analysis. We apply our new way of performing an MR analysis to data from UK Biobank to investigate the causal role of sleep disturbance on HbA1c levels, an important determinant of diabetes.
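
The procedure outlined in the abstract can be illustrated with a minimal sketch on simulated data. This is not the authors' implementation: the function name `collider_correction`, the simulation parameters, and the use of a simple unweighted through-origin regression in place of a full summary data meta-analysis are all assumptions made for illustration. It assumes a linear model with no pleiotropy, in which the SNP-outcome coefficients obtained after conditioning on the exposure equal (causal effect minus observational association) times the SNP-exposure coefficients.

```python
import numpy as np

def collider_correction(G, X, Y):
    """Illustrative sketch only (hypothetical helper, not the paper's code).

    G: (n, p) genotype matrix, X: (n,) exposure, Y: (n,) outcome.
    Returns (observational estimate, estimated bias correction, corrected estimate).
    """
    n, _ = G.shape
    ones = np.ones((n, 1))

    # Step 1: SNP-exposure coefficients from a regression of X on all SNPs.
    gamma_hat = np.linalg.lstsq(np.hstack([ones, G]), X, rcond=None)[0][1:]

    # Step 2: regress Y on X and G together. Conditioning on X induces
    # collider bias in the SNP coefficients; the X coefficient is the
    # (confounded) observational association.
    coef = np.linalg.lstsq(np.hstack([ones, X[:, None], G]), Y, rcond=None)[0]
    beta_star, gamma_star = coef[1], coef[2:]

    # Step 3: regress the collider-biased SNP-outcome coefficients on the
    # SNP-exposure coefficients through the origin (unweighted here; an
    # inverse-variance weighted or pleiotropy-robust method could be swapped in).
    psi_hat = (gamma_hat @ gamma_star) / (gamma_hat @ gamma_hat)

    # Step 4: correct the observational association.
    return beta_star, psi_hat, beta_star + psi_hat

# Simulated one-sample data; all values below are assumed for illustration.
rng = np.random.default_rng(0)
n, p, beta_true = 20000, 10, 0.5
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)
U = rng.normal(size=n)                      # unmeasured confounder
X = G @ np.full(p, 0.2) + U + rng.normal(size=n)
Y = beta_true * X + U + rng.normal(size=n)

beta_star, psi_hat, beta_hat = collider_correction(G, X, Y)
print(f"observational: {beta_star:.3f}  correction: {psi_hat:.3f}  corrected: {beta_hat:.3f}")
```

In this simulated setup the observational estimate is inflated by the confounder, and the estimated correction is negative, pulling the final estimate back toward the assumed true effect of 0.5.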

Date: 2021

Downloads: (external link)
https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1009703 (text/html)
https://journals.plos.org/plosgenetics/article/fil ... 09703&type=printable (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:plo:pgen00:1009703

DOI: 10.1371/journal.pgen.1009703


More articles in PLOS Genetics from Public Library of Science

 
Handle: RePEc:plo:pgen00:1009703