A diploid assembly-based benchmark for variants in the major histocompatibility complex

Chen-Shan Chin, Justin Wagner, Qiandong Zeng, Erik Garrison, Shilpa Garg, Arkarachai Fungtammasan, Mikko Rautiainen, Sergey Aganezov, Melanie Kirsche, Samantha Zarate, Michael C. Schatz, Chunlin Xiao, William J. Rowell, Charles Markello, Jesse Farek, Fritz J. Sedlazeck, Vikas Bansal, Byunggil Yoo, Neil Miller, Xin Zhou, Andrew Carroll, Alvaro Martinez Barrio, Marc Salit, Tobias Marschall, Alexander T. Dilthey and Justin M. Zook
Additional contact information
Chen-Shan Chin: DNAnexus, Inc, 1975 W El Camino Real
Justin Wagner: Material Measurement Laboratory, National Institute of Standards and Technology
Qiandong Zeng: Laboratory Corporation of America Holdings
Erik Garrison: University of California, Santa Cruz
Shilpa Garg: Harvard Medical School
Arkarachai Fungtammasan: DNAnexus, Inc, 1975 W El Camino Real
Mikko Rautiainen: Center for Bioinformatics, Saarland University, Saarland Informatics Campus E2.1
Sergey Aganezov: Johns Hopkins University
Melanie Kirsche: Johns Hopkins University
Samantha Zarate: Johns Hopkins University
Michael C. Schatz: Johns Hopkins University
Chunlin Xiao: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health
William J. Rowell: Pacific Biosciences
Charles Markello: University of California, Santa Cruz
Jesse Farek: Human Genome Sequencing Center, Baylor College of Medicine
Fritz J. Sedlazeck: Human Genome Sequencing Center, Baylor College of Medicine
Vikas Bansal: University of California San Diego
Byunggil Yoo: Genomic Medicine Center, Children’s Mercy Kansas City
Neil Miller: Genomic Medicine Center, Children’s Mercy Kansas City
Xin Zhou: Stanford University
Andrew Carroll: Google Inc, 1600 Amphitheatre Pkwy
Alvaro Martinez Barrio: 10x Genomics
Marc Salit: Joint Initiative for Metrology in Biology
Tobias Marschall: Institute of Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf
Alexander T. Dilthey: Institute of Medical Microbiology and Hospital Hygiene, Heinrich Heine University Düsseldorf
Justin M. Zook: Material Measurement Laboratory, National Institute of Standards and Technology

Nature Communications, 2020, vol. 11, issue 1, 1-9

Abstract: Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assembly is particularly useful: the Major Histocompatibility Complex (MHC). Here, we develop a human genome benchmark derived from a diploid assembly for the openly consented Genome in a Bottle sample HG002. We assemble a single contig for each haplotype, align them to the reference, call phased small and structural variants, and define a small variant benchmark for the MHC, covering 94% of the MHC and 22368 variants smaller than 50 bp, 49% more variants than a mapping-based benchmark. This benchmark reliably identifies errors in mapping-based callsets, and enables performance assessment in regions with much denser, complex variation than regions covered by previous benchmarks.

Date: 2020

Downloads: (external link)
https://www.nature.com/articles/s41467-020-18564-9 Abstract (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:11:y:2020:i:1:d:10.1038_s41467-020-18564-9

DOI: 10.1038/s41467-020-18564-9

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Page updated 2025-03-19
Handle: RePEc:nat:natcom:v:11:y:2020:i:1:d:10.1038_s41467-020-18564-9