Towards a reference genome that captures global genetic diversity

Karen H. Y. Wong, Walfred Ma, Chun-Yu Wei, Erh-Chan Yeh, Wan-Jia Lin, Elin H. F. Wang, Jen-Ping Su, Feng-Jen Hsieh, Hsiao-Jung Kao, Hsiao-Huei Chen, Stephen K. Chow, Eleanor Young, Catherine Chu, Annie Poon, Chi-Fan Yang, Dar-Shong Lin, Yu-Feng Hu, Jer-Yuarn Wu, Ni-Chung Lee, Wuh-Liang Hwu, Dario Boffelli, David Martin, Ming Xiao and Pui-Yan Kwok
Additional contact information
Karen H. Y. Wong: Cardiovascular Research Institute, University of California, San Francisco
Walfred Ma: Cardiovascular Research Institute, University of California, San Francisco
Chun-Yu Wei: Institute of Biomedical Sciences, Academia Sinica
Erh-Chan Yeh: Institute of Biomedical Sciences, Academia Sinica
Wan-Jia Lin: Institute of Biomedical Sciences, Academia Sinica
Elin H. F. Wang: Institute of Biomedical Sciences, Academia Sinica
Jen-Ping Su: Institute of Biomedical Sciences, Academia Sinica
Feng-Jen Hsieh: Institute of Biomedical Sciences, Academia Sinica
Hsiao-Jung Kao: Institute of Biomedical Sciences, Academia Sinica
Hsiao-Huei Chen: Institute of Biomedical Sciences, Academia Sinica
Stephen K. Chow: Cardiovascular Research Institute, University of California, San Francisco
Eleanor Young: School of Biomedical Engineering, Drexel University
Catherine Chu: Institute for Human Genetics, University of California, San Francisco
Annie Poon: Institute for Human Genetics, University of California, San Francisco
Chi-Fan Yang: Institute of Biomedical Sciences, Academia Sinica
Dar-Shong Lin: Department of Pediatrics, Mackay Memorial Hospital
Yu-Feng Hu: Institute of Biomedical Sciences, Academia Sinica
Jer-Yuarn Wu: Institute of Biomedical Sciences, Academia Sinica
Ni-Chung Lee: Departments of Pediatrics and Medical Genetics, National Taiwan University Hospital
Wuh-Liang Hwu: Departments of Pediatrics and Medical Genetics, National Taiwan University Hospital
Dario Boffelli: Children’s Hospital Oakland Research Institute
David Martin: Children’s Hospital Oakland Research Institute
Ming Xiao: School of Biomedical Engineering, Drexel University
Pui-Yan Kwok: Cardiovascular Research Institute, University of California, San Francisco

Nature Communications, 2020, vol. 11, issue 1, 1-11

Abstract: The current human reference genome is predominantly derived from a single individual and it does not adequately reflect human genetic diversity. Here, we analyze 338 high-quality human assemblies of genetically divergent human populations to identify missing sequences in the human reference genome with breakpoint resolution. We identify 127,727 recurrent non-reference unique insertions spanning 18,048,877 bp, some of which disrupt exons and known regulatory elements. To improve genome annotations, we linearly integrate these sequences into the chromosomal assemblies and construct a Human Diversity Reference. Leveraging this reference, an average of 402,573 previously unmapped reads can be recovered for a given genome sequenced to ~40X coverage. Transcriptomic diversity among these non-reference sequences can also be directly assessed. We successfully map tens of thousands of previously discarded RNA-Seq reads to this reference and identify transcription evidence in 4781 gene loci, underlining the importance of these non-reference sequences in functional genomics. Our extensive datasets are important advances toward a comprehensive reference representation of global human genetic diversity.

Date: 2020

Downloads: (external link)
https://www.nature.com/articles/s41467-020-19311-w Abstract (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:11:y:2020:i:1:d:10.1038_s41467-020-19311-w

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-020-19311-w

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:natcom:v:11:y:2020:i:1:d:10.1038_s41467-020-19311-w