A Hybrid Likelihood Model for Sequence-Based Disease Association Studies

Chen, Yun-Ching; Carter, Hannah; Parla, Jennifer; Kramer, Melissa; Goes, Fernando S; Pirooznia, Mehdi; Zandi, Peter P; McCombie, W Richard; Potash, James B; Karchin, Rachel

A Hybrid Likelihood Model for Sequence-Based Disease Association Studies

Yun-Ching Chen, Hannah Carter, Jennifer Parla, Melissa Kramer, Fernando S Goes, Mehdi Pirooznia, Peter P Zandi, W Richard McCombie, James B Potash and Rachel Karchin

PLOS Genetics, 2013, vol. 9, issue 1, 1-18

Abstract: In the past few years, case-control studies of common diseases have shifted their focus from single genes to whole exomes. New sequencing technologies now routinely detect hundreds of thousands of sequence variants in a single study, many of which are rare or even novel. The limitation of classical single-marker association analysis for rare variants has been a challenge in such studies. A new generation of statistical methods for case-control association studies has been developed to meet this challenge. A common approach to association analysis of rare variants is the burden-style collapsing methods to combine rare variant data within individuals across or within genes. Here, we propose a new hybrid likelihood model that combines a burden test with a test of the position distribution of variants. In extensive simulations and on empirical data from the Dallas Heart Study, the new model demonstrates consistently good power, in particular when applied to a gene set (e.g., multiple candidate genes with shared biological function or pathway), when rare variants cluster in key functional regions of a gene, and when protective variants are present. When applied to data from an ongoing sequencing study of bipolar disorder (191 cases, 107 controls), the model identifies seven gene sets with nominal p-values0.05, of which one MAPK signaling pathway (KEGG) reaches trend-level significance after correcting for multiple testing. Author Summary: Inexpensive, high-throughput sequencing has transformed the field of case-control association studies. For the first time, it may be possible to identify the genetic underpinnings of complex diseases, by sequencing the DNA of hundreds (even thousands) of cases and controls and comparing patterns of DNA sequence variation. However, complex diseases are likely to be caused by many variants, some of which are very rare. Taken one at a time, the association between variant and disease phenotype may not be detectable by current statistical methods. One strategy is to identify regions where important variants occur by “collapsing” variants into groups. Here, we present a new collapsing approach, capable of detecting subtle genetic differences between cases and controls. We show, in extensive simulations and using a benchmark set of genes involved in human triglyceride levels, that the approach is potentially more powerful than existing methods. We apply the new method to an ongoing sequencing study of bipolar cases and controls and identify a set of genes found in neuronal synapses, which may be implicated in bipolar disorder.

Date: 2013
References: View complete reference list from CitEc
Citations: View citations in EconPapers (1)

Downloads: (external link)
https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1003224 (text/html)
https://journals.plos.org/plosgenetics/article/fil ... 03224&type=printable (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:plo:pgen00:1003224

DOI: 10.1371/journal.pgen.1003224

Access Statistics for this article

More articles in PLOS Genetics from Public Library of Science
Bibliographic data for series maintained by plosgenetics ().