Prediction of causal genes at GWAS loci with pleiotropic gene regulatory effects using sets of correlated instrumental variables
Mariyam Khan,
Adriaan-Alexander Ludl,
Sean Bankier,
Johan L M Björkegren and
Tom Michoel
PLOS Genetics, 2024, vol. 20, issue 11, 1-29
Abstract:
Multivariate Mendelian randomization (MVMR) is a statistical technique that uses sets of genetic instruments to estimate the direct causal effects of multiple exposures on an outcome of interest. At genomic loci with pleiotropic gene regulatory effects, that is, loci where the same genetic variants are associated with multiple nearby genes, MVMR can potentially be used to predict candidate causal genes. However, consensus in the field dictates that the genetic instruments in MVMR must be independent (not in linkage disequilibrium), which is usually not possible when considering a group of candidate genes from the same locus. Here we used causal inference theory to show that MVMR with correlated instruments satisfies the instrumental set condition. This is a classical result by Brito and Pearl (2002) for structural equation models that guarantees the identifiability of individual causal effects in situations where multiple exposures collectively, but not individually, separate a set of instrumental variables from an outcome variable. Extensive simulations confirmed the validity and usefulness of these theoretical results. Importantly, the causal effect estimates remained unbiased and their variance remained small even when instruments were highly correlated, while bias introduced by horizontal pleiotropy or LD matrix sampling error was comparable to standard MR. We applied MVMR with correlated instrumental variable sets at genome-wide significant loci for coronary artery disease (CAD) risk using expression quantitative trait loci (eQTL) data from seven vascular and metabolic tissues in the STARNET study. Our method predicts causal genes at twelve loci, each associated with multiple colocated genes in multiple tissues. We confirm causal roles for PHACTR1 and ADAMTS7 in arterial tissues, among others.
However, the extensive degree of regulatory pleiotropy across tissues and the limited number of causal variants in each locus still require that MVMR is run on a tissue-by-tissue basis, and testing all gene-tissue pairs with cis-eQTL associations at a given locus in a single model to predict causal gene-tissue combinations remains infeasible. Our results show that within tissues, MVMR with dependent, as opposed to independent, sets of instrumental variables significantly expands the scope for predicting causal genes in disease risk loci with pleiotropic regulatory effects. However, considering risk loci with regulatory pleiotropy that also spans across tissues remains an unsolved problem.
Author summary: Although genome-wide association studies have mapped thousands of genetic variants that explain the heritable nature of many complex traits and diseases, the causal genes and mechanisms underlying these associations are often unclear. This is partly due to the widespread presence of “regulatory pleiotropy”, a phenomenon where the same genetic variants affect gene expression of multiple genes in the same genomic locus across multiple tissues. Mendelian randomization is a statistical method that uses genetic variants as instrumental variables to estimate causal effects of exposures on outcomes. Here we have extended this technique to the situation where multiple exposures can have a simultaneous effect on an outcome, and no independent instrumental variables are available for each exposure. When applied to a dataset of genetic and gene expression variation in seven vascular and metabolic tissues of 600 individuals undergoing heart surgery, our method identified candidate causal genes and tissues for coronary artery disease risk at genomic positions where regulatory pleiotropy and the extensive correlations between genetic variants made the application of existing Mendelian randomization methods infeasible.
Further support for the validity of our method to identify causal genes using sets of correlated instrumental variables was provided by extensive simulations and theoretical results.
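The idea described in the abstract, that several exposures can be identified jointly from a set of correlated instruments, can be illustrated with a toy simulation. The sketch below is not the authors' implementation and uses none of their data. It assumes individual-level data, Gaussian stand-ins for genotypes, a hypothetical LD matrix `ld`, hypothetical effect sizes `A` and `beta_true`, and a plain two-stage least squares estimator. Each instrument affects both exposures, so neither exposure has an instrument of its own.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50_000, 4  # samples, number of instruments (hypothetical sizes)

# Hypothetical LD structure: correlation 0.8^|i-j| between instruments.
ld = 0.8 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
# Gaussian stand-ins for genotypes (an assumption made for simplicity).
G = rng.multivariate_normal(np.zeros(p), ld, size=n)

# Every instrument affects both exposures (regulatory pleiotropy).
A = np.array([[0.5, 0.1],
              [0.4, 0.2],
              [0.1, 0.5],
              [0.2, 0.4]])
beta_true = np.array([0.3, 0.0])  # only the first exposure is causal

# An unobserved confounder U affects both exposures and the outcome.
U = rng.normal(size=n)
X = G @ A + U[:, None] + rng.normal(size=(n, 2))
Y = X @ beta_true + U + rng.normal(size=n)

# Stage 1: regress the exposures on the full correlated instrument set.
# All variables have mean zero by construction, so no intercept is fitted.
stage1_coef, *_ = np.linalg.lstsq(G, X, rcond=None)
X_hat = G @ stage1_coef

# Stage 2: regress the outcome on the genetically predicted exposures.
beta_2sls, *_ = np.linalg.lstsq(X_hat, Y, rcond=None)

# Naive regression of outcome on exposures, biased by the confounder.
beta_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)

print("true effects:", beta_true)
print("2SLS estimate:", beta_2sls)
print("OLS estimate:", beta_ols)
```

In this setup the instrument set as a whole is what matters: the matrix `A` has full column rank, so the two exposures together account for the path from the instruments to the outcome even though no single instrument is specific to one exposure.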
Date: 2024
Downloads: (external link)
https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1011473 (text/html)
https://journals.plos.org/plosgenetics/article/fil ... 11473&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pgen00:1011473
DOI: 10.1371/journal.pgen.1011473