Genome-wide profiling of highly similar paralogous genes using HiFi sequencing

Xiao Chen, Daniel Baker, Egor Dolzhenko, Joseph M. Devaney, Jessica Noya, April S. Berlyoung, Rhonda Brandon, Kathleen S. Hruska, Lucas Lochovsky, Paul Kruszka, Scott Newman, Emily Farrow, Isabelle Thiffault, Tomi Pastinen, Dalia Kasperaviciute, Christian Gilissen, Lisenka Vissers, Alexander Hoischen, Seth Berger, Eric Vilain, Emmanuèle Délot and Michael A. Eberle
Additional contact information
Xiao Chen: PacBio
Daniel Baker: PacBio
Egor Dolzhenko: PacBio
Joseph M. Devaney: GeneDx
Jessica Noya: GeneDx
April S. Berlyoung: GeneDx
Rhonda Brandon: GeneDx
Kathleen S. Hruska: GeneDx
Lucas Lochovsky: GeneDx
Paul Kruszka: GeneDx
Scott Newman: GeneDx
Emily Farrow: Children’s Mercy Kansas City
Isabelle Thiffault: Children’s Mercy Kansas City
Tomi Pastinen: Children’s Mercy Kansas City
Dalia Kasperaviciute: Genomics England Ltd.
Christian Gilissen: Radboud University Medical Center
Lisenka Vissers: Radboud University Medical Center
Alexander Hoischen: Radboud University Medical Center
Seth Berger: Children’s National Hospital
Eric Vilain: University of California
Emmanuèle Délot: University of California
Michael A. Eberle: PacBio

Nature Communications, 2025, vol. 16, issue 1, 1-13

Abstract: Variant calling is hindered in segmental duplications by sequence homology. We developed Paraphase, a HiFi-based informatics method that resolves highly similar genes by phasing all haplotypes of paralogous genes together. We applied Paraphase to 160 long (>10 kb) segmental duplication regions across the human genome with high (>99%) sequence similarity, encoding 316 genes. Analysis across five ancestral populations revealed highly variable copy numbers of these regions. We identified 23 paralog groups with exceptionally low within-group diversity, where extensive gene conversion and unequal crossing over contribute to highly similar gene copies. Furthermore, our analysis of 36 trios identified 7 de novo SNVs and 4 de novo gene conversion events, 2 of which are non-allelic. Finally, we summarized extensive genetic diversity in 9 medically relevant genes previously considered challenging to genotype. Paraphase provides a framework for resolving gene paralogs, enabling accurate testing in medically relevant genes and population-wide studies of previously inaccessible genes.

Date: 2025

Downloads: (external link)
https://www.nature.com/articles/s41467-025-57505-2 Abstract (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-57505-2

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-025-57505-2

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-04-02
Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-57505-2