Modeling the length distribution of gene conversion tracts in humans from the UK Biobank sequence data

Nobuaki Masaki and Sharon R Browning

PLOS Genetics, 2025, vol. 21, issue 11, 1-21

Abstract: Non-crossover gene conversion is a type of meiotic recombination characterized by the non-reciprocal transfer of genetic material between homologous chromosomes. Gene conversions are thought to occur within relatively short tracts of DNA. In this study, we propose a statistical method to model the length distribution of gene conversion tracts in humans, using nearly one million gene conversion tracts detected from the UK Biobank whole autosome data. To handle the large number of tracts, we designed a computationally efficient inferential framework. Our method further accounts for regional variation in the density of variant sites and heterozygosity across the genome, which can influence the observed length of gene conversion tracts. We allow for multiple candidate tract length distributions and select the best fitting distribution using the Bayesian Information Criterion (BIC). Using a mixture of two geometric components for the tract length distribution, we estimate that the smaller component has a mean of 16.9 bp (95% CI: [16.4, 17.0]), and the larger component has a mean of 724.7 bp (95% CI: [720.1, 728.7]). We further estimate the proportion of tracts from the second component to be 0.00525 (95% CI: [0.005, 0.00525]). After stratifying by crossover-hotspot overlap, we infer that tracts whose midpoints lie within crossover hotspots are, on average, longer than the remaining tracts.

Author summary: Gene conversions are recombination events distinct from crossovers, in which alleles are transferred between homologous sequences within a short tract. Previous studies have investigated the lengths of gene conversion tracts using pedigree or sperm-typing data, but the number of gene conversion events that can be observed from these datasets is limited. In our study, we used almost one million detected gene conversion tracts from the ancestral history of UK Biobank participants to study the length distribution of these tracts.
Within a gene conversion tract in the transmitting parent, alleles are only converted at heterozygous sites, so we cannot observe the full length of the gene conversion tract. To account for this in our method, we model the allele conversion probability separately for each detected tract. Our method allows for various distributions of the gene conversion tract length and is computationally efficient to handle all the tracts detected from the UK Biobank whole autosome data. Fitting a two-component model to shorter detected tracts that do not exceed 1.5 kb, we estimate the means of the two components to be 16.9 bp and 724.7 bp respectively. We further estimate the proportion of tracts from the second component to be 0.00525.
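
To make the reported two-component model concrete, the following Python sketch evaluates a mixture of two geometric distributions at the point estimates quoted above (means of 16.9 bp and 724.7 bp, with proportion 0.00525 from the longer component). It is only an illustration and not the authors' inference code: the class name `GeometricMixture` and its methods are hypothetical, the geometric distribution is assumed to be supported on lengths 1, 2, ... with mean 1/p (the paper may use a different parameterization), and the sketch ignores the per-tract adjustment for variant-site density and heterozygosity that the method applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeometricMixture:
    """Hypothetical two-component geometric mixture on lengths n = 1, 2, ... (bp)."""

    mean_short: float   # mean of the shorter component, in bp
    mean_long: float    # mean of the longer component, in bp
    weight_long: float  # proportion of tracts from the longer component

    def pmf(self, n: int) -> float:
        """P(N = n) under the mixture."""
        if n < 1:
            return 0.0
        # Assumption: each component has success probability p = 1 / mean.
        p_short = 1.0 / self.mean_short
        p_long = 1.0 / self.mean_long
        short = p_short * (1.0 - p_short) ** (n - 1)
        long_ = p_long * (1.0 - p_long) ** (n - 1)
        return (1.0 - self.weight_long) * short + self.weight_long * long_

    def survival(self, n: int) -> float:
        """P(N > n): probability that a tract is longer than n bp."""
        if n < 0:
            return 1.0
        p_short = 1.0 / self.mean_short
        p_long = 1.0 / self.mean_long
        return ((1.0 - self.weight_long) * (1.0 - p_short) ** n
                + self.weight_long * (1.0 - p_long) ** n)

    def mean(self) -> float:
        """Mixture mean: the weighted average of the component means."""
        return ((1.0 - self.weight_long) * self.mean_short
                + self.weight_long * self.mean_long)


# Point estimates quoted in the abstract.
model = GeometricMixture(mean_short=16.9, mean_long=724.7, weight_long=0.00525)
print(f"mixture mean: {model.mean():.1f} bp")
print(f"P(length > 100 bp): {model.survival(100):.4f}")
```

Under these assumptions the mixture mean is about 20.6 bp, which shows how strongly the short component dominates: the long component contributes only about 0.5% of tracts. Because the author summary states that the model was fitted to detected tracts that do not exceed 1.5 kb, this figure describes the fitted model rather than a direct estimate over all tracts.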

Date: 2025

Downloads: (external link)
https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1011951 (text/html)
https://journals.plos.org/plosgenetics/article/fil ... 11951&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pgen00:1011951

DOI: 10.1371/journal.pgen.1011951

More articles in PLOS Genetics from Public Library of Science
 
Page updated 2025-11-29
Handle: RePEc:plo:pgen00:1011951