Pan-African genome demonstrates how population-specific genome graphs improve high-throughput sequencing data analysis

H. Serhat Tetikol, Deniz Turgut, Kubra Narci, Gungor Budak, Ozem Kalay, Elif Arslan, Sinem Demirkaya-Budak, Alexey Dolgoborodov, Duygu Kabakci-Zorlu, Vladimir Semenyuk, Amit Jain and Brandi N. Davis-Dusenbery
Additional contact information
H. Serhat Tetikol: Seven Bridges Genomics
Deniz Turgut: Seven Bridges Genomics
Kubra Narci: Seven Bridges Genomics
Gungor Budak: Seven Bridges Genomics
Ozem Kalay: Seven Bridges Genomics
Elif Arslan: Seven Bridges Genomics
Sinem Demirkaya-Budak: Seven Bridges Genomics
Alexey Dolgoborodov: Seven Bridges Genomics
Duygu Kabakci-Zorlu: Seven Bridges Genomics
Vladimir Semenyuk: Seven Bridges Genomics
Amit Jain: Seven Bridges Genomics
Brandi N. Davis-Dusenbery: Seven Bridges Genomics

Nature Communications, 2022, vol. 13, issue 1, 1-11

Abstract: Graph-based genome reference representations have seen significant development, motivated by the inadequacy of the current human genome reference to represent the diverse genetic information from different human populations and its inability to maintain the same level of accuracy for non-European ancestries. While there have been many efforts to develop computationally efficient graph-based toolkits for NGS read alignment and variant calling, methods to curate genomic variants and subsequently construct genome graphs remain an understudied problem that inevitably determines the effectiveness of the overall bioinformatics pipeline. In this study, we discuss obstacles encountered during graph construction and propose methods for sample selection based on population diversity, graph augmentation with structural variants and resolution of graph reference ambiguity caused by information overload. Moreover, we present the case for iteratively augmenting tailored genome graphs for targeted populations and demonstrate this approach on the whole-genome samples of African ancestry. Our results show that population-specific graphs, as more representative alternatives to linear or generic graph references, can achieve significantly lower read mapping errors and enhanced variant calling sensitivity, in addition to providing the improvements of joint variant calling without the need of computationally intensive post-processing steps.

Date: 2022

Downloads: (external link)
https://www.nature.com/articles/s41467-022-31724-3 Abstract (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-31724-3

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-022-31724-3

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-31724-3