multi-GPA-Tree: Statistical approach for pleiotropy informed and functional annotation tree guided prioritization of GWAS results
Aastha Khatiwada,
Ayse Selen Yilmaz,
Bethany J Wolf,
Maciej Pietrzak and
Dongjun Chung
PLOS Computational Biology, 2023, vol. 19, issue 12, 1-27
Abstract:
Genome-wide association studies (GWAS) have successfully identified over two hundred thousand genotype-trait associations. Yet some challenges remain. First, complex traits are often associated with many single nucleotide polymorphisms (SNPs), most with small or moderate effect sizes, making them difficult to detect. Second, many complex traits share a common genetic basis due to ‘pleiotropy’, and although few methods consider it, leveraging pleiotropy can improve statistical power to detect genotype-trait associations with weaker effect sizes. Third, currently available statistical methods are limited in explaining the functional mechanisms through which genetic variants are associated with specific or multiple traits. We propose multi-GPA-Tree to address these challenges. The multi-GPA-Tree approach can identify risk SNPs associated with single as well as multiple traits while also identifying the combinations of functional annotations that can explain the mechanisms through which risk-associated SNPs are linked with the traits. First, we implemented simulation studies to evaluate the proposed multi-GPA-Tree method and compared its performance with existing statistical approaches. The results indicate that multi-GPA-Tree outperforms existing statistical approaches in detecting risk-associated SNPs for multiple traits. Second, we applied multi-GPA-Tree to GWAS of systemic lupus erythematosus (SLE) and rheumatoid arthritis (RA), and of Crohn’s disease (CD) and ulcerative colitis (UC), together with functional annotation data including GenoSkyline and GenoSkylinePlus.
Our results demonstrate that multi-GPA-Tree can be a powerful tool that improves association mapping while facilitating understanding of the underlying genetic architecture of complex traits and potential mechanisms linking risk-associated SNPs with complex traits.
Author summary:
In spite of continued success in developing statistical methodologies that integrate GWAS summary statistics and functional annotation data, existing methods are unable to pinpoint the interactions between functional annotations that influence one or more traits. Hence, the underlying interactions between biological mechanisms linking risk-associated SNPs to traits remain unknown. We propose multi-GPA-Tree to identify risk-associated SNPs and the combinations of functional annotations related to risk-associated SNPs for one or more traits. Notably, multi-GPA-Tree requires only GWAS p-value summary statistics, instead of individual-level genotype-phenotype data, making it more viable to implement. Compared to the existing state-of-the-art methods, multi-GPA-Tree showed improved performance in simulation studies and validated results for several autoimmune diseases in a real data application. These combined results suggest that multi-GPA-Tree is an effective tool for integrative analysis and can potentially be valuable to clinical genomic researchers for hypothesis generation and validation.
Date: 2023
Downloads:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011686 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 11686&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1011686
DOI: 10.1371/journal.pcbi.1011686
More articles in PLOS Computational Biology from Public Library of Science