A draft UAE-based Arab pangenome reference
Nasna Nassir,
Mohamed A. Almarri,
Muhammad Kumail,
Nesrin Mohamed,
Bipin Balan,
Shehzad Hanif,
Maryam AlObathani,
Bassam Jamalalail,
Hanan Elsokary,
Dasuki Kondaramage,
Suhana Shiyas,
Noor Kosaji,
Dharana Satsangi,
Madiha Hamdi Saif Abdelmotagali,
Ahmad Abou Tayoun,
Olfat Zuhair Salem Ahmed,
Douaa Fathi Youssef,
Hanan Al Suwaidi,
Ammar Albanna,
Stefan S Du Plessis,
Hamda Hassan Khansaheb,
Alawi Alsheikh-Ali and
Mohammed Uddin
Additional contact information
Nasna Nassir: Dubai Health
Mohamed A. Almarri: Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai Health
Muhammad Kumail: Dubai Health
Nesrin Mohamed: Dubai Health
Bipin Balan: Dubai Health
Shehzad Hanif: Dubai Health
Maryam AlObathani: Dubai Health
Bassam Jamalalail: Dubai Health
Hanan Elsokary: Dubai Health
Dasuki Kondaramage: Dubai Health
Suhana Shiyas: Dubai Health
Noor Kosaji: Dubai Health
Dharana Satsangi: Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai Health
Madiha Hamdi Saif Abdelmotagali: Dubai Health
Ahmad Abou Tayoun: Dubai Health
Olfat Zuhair Salem Ahmed: Dubai Health
Douaa Fathi Youssef: Dubai Health
Hanan Al Suwaidi: Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai Health
Ammar Albanna: Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai Health
Stefan S Du Plessis: Dubai Health
Hamda Hassan Khansaheb: Dubai Health
Alawi Alsheikh-Ali: Dubai Health
Mohammed Uddin: Dubai Health
Nature Communications, 2025, vol. 16, issue 1, 1-17
Abstract:
Pangenomes provide a robust and comprehensive portrayal of genetic diversity in humans, but Arab populations remain underrepresented. We present a preliminary UAE-based Arab Pangenome Reference (UPR) utilizing 53 individuals of diverse Arab ethnicities residing in the United Arab Emirates. We assembled nuclear and mitochondrial pangenomes using 35.27X high-fidelity long reads, 54.22X ultralong reads and 65.46X Hi-C reads. This approach yielded contiguous haplotype-phased de novo assemblies of exceptional quality, with an average N50 of 124.28 Mb. We discovered 111.96 million base pairs of previously uncharacterized euchromatic sequences absent from existing human pangenomes, the T2T-CHM13 and GRCh38 reference human genomes, and other public datasets. Moreover, we identified 8.94 million population-specific small variants and 235,195 structural variants within the Arab pangenome that are not present in linear and pangenome references or public datasets. We detected 883 gene duplications, including that of the TATA-binding protein gene TAF11L5, which was uniquely duplicated across all Arab populations; the duplications included 15.06% of genes associated with recessive diseases. By exploring the mitochondrial pangenome, we identified 1,436 bp of previously unreported sequences. Our study provides a valuable resource for future genetic research and genomic medicine initiatives in Arab populations and other populations with similar genetic backgrounds.
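The N50 quoted in the abstract is a standard contiguity statistic: the largest length L such that contigs of length L or longer together cover at least half of the total assembly. The following is a minimal sketch of that definition only; the function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the contig at which the running total,
    taken over contigs sorted from longest to shortest, first
    reaches at least half of the summed assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical toy lengths (arbitrary units), not data from the study.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

For the toy input the lengths sum to 300, and the two longest contigs (80 and 70) already reach 150, so the sketch reports 70. An average N50 such as the 124.28 Mb in the abstract would presumably be the mean of this statistic across the individual assemblies.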
Date: 2025
Downloads:
https://www.nature.com/articles/s41467-025-61645-w Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-61645-w
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-025-61645-w
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.