Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel
Jie Huang,
Bryan Howie,
Shane McCarthy,
Yasin Memari,
Klaudia Walter,
Josine L. Min,
Petr Danecek,
Giovanni Malerba,
Elisabetta Trabetti,
Hou-Feng Zheng,
Giovanni Gambaro,
J. Brent Richards,
Richard Durbin,
Nicholas J. Timpson,
Jonathan Marchini and
Nicole Soranzo
Additional contact information
Jie Huang: The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus
Bryan Howie: Adaptive Biotechnologies Corporation
Shane McCarthy: The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus
Yasin Memari: The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus
Klaudia Walter: The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus
Josine L. Min: MRC Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove
Petr Danecek: The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus
Giovanni Malerba: Biology and Genetics, University of Verona
Elisabetta Trabetti: Biology and Genetics, University of Verona
Hou-Feng Zheng: Lady Davis Institute, Jewish General Hospital
Giovanni Gambaro: Institute of Internal Medicine, Renal Program, Columbus-Gemelli University Hospital, Catholic University
J. Brent Richards: Lady Davis Institute, Jewish General Hospital
Richard Durbin: The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus
Nicholas J. Timpson: MRC Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove
Jonathan Marchini: University of Oxford
Nicole Soranzo: The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus
Nature Communications, 2015, vol. 6, issue 1, 1-9
Abstract:
Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low depth (average 7x), aiming to exhaustively characterize genetic variation down to 0.1% minor allele frequency in the British population. Here we demonstrate the value of this resource for improving imputation accuracy at rare and low-frequency variants in both a UK and an Italian population. We show that large increases in imputation accuracy can be achieved by re-phasing WGS reference panels after initial genotype calling. We also present a method for combining WGS panels to improve variant coverage and downstream imputation accuracy, which we illustrate by integrating 7,562 WGS haplotypes from the UK10K project with 2,184 haplotypes from the 1000 Genomes Project. Finally, we introduce a novel approximation that maintains speed without sacrificing imputation accuracy for rare variants.
Date: 2015
Downloads: https://www.nature.com/articles/ncomms9111 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:6:y:2015:i:1:d:10.1038_ncomms9111
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/ncomms9111
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie