Quantifying unobserved protein-coding variants in human populations provides a roadmap for large-scale sequencing projects
James Zou,
Gregory Valiant,
Paul Valiant,
Konrad Karczewski,
Siu On Chan,
Kaitlin Samocha,
Monkol Lek,
Shamil Sunyaev,
Mark Daly and
Daniel G. MacArthur
Additional contact information
James Zou: Stanford University
Gregory Valiant: Stanford University
Paul Valiant: Brown University
Konrad Karczewski: Analytic and Translational Genetics Unit, Massachusetts General Hospital
Siu On Chan: Computer Science and Engineering, Chinese University of Hong Kong
Kaitlin Samocha: Analytic and Translational Genetics Unit, Massachusetts General Hospital
Monkol Lek: Analytic and Translational Genetics Unit, Massachusetts General Hospital
Shamil Sunyaev: Broad Institute of MIT and Harvard
Mark Daly: Analytic and Translational Genetics Unit, Massachusetts General Hospital
Daniel G. MacArthur: Analytic and Translational Genetics Unit, Massachusetts General Hospital
Nature Communications, 2016, vol. 7, issue 1, 1-5
Abstract:
As new proposals aim to sequence ever-larger collections of humans, it is critical to have a quantitative framework to evaluate the statistical power of these projects. We developed a new algorithm, UnseenEst, and applied it to the exomes of 60,706 individuals to estimate the frequency distribution of all protein-coding variants, including rare variants that have not yet been observed in current cohorts. Our results quantify the number of new variants that we expect to identify as sequencing cohorts reach hundreds of thousands of individuals. With 500K individuals, we expect to capture 7.5% of all possible loss-of-function variants and 12% of all possible missense variants. We also estimate that 2,900 genes have loss-of-function frequency of
Date: 2016
Downloads:
https://www.nature.com/articles/ncomms13293 (abstract, text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:7:y:2016:i:1:d:10.1038_ncomms13293
DOI: 10.1038/ncomms13293
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie