Accurate Detection of Recombinant Breakpoints in Whole-Genome Alignments
Oscar Westesson and
Ian Holmes
PLOS Computational Biology, 2009, vol. 5, issue 3, 1-13
Abstract:
We propose a novel method for detecting sites of molecular recombination in multiple alignments. Our approach is a compromise between previous extremes of computationally prohibitive but mathematically rigorous methods and imprecise heuristic methods. Using a combined algorithm for estimating tree structure and hidden Markov model parameters, our program detects changes in phylogenetic tree topology over a multiple sequence alignment. We evaluate our method on benchmark datasets from previous studies on two recombinant pathogens, Neisseria and HIV-1, as well as simulated data. We show that we are not only able to detect recombinant regions of vastly different sizes but also the location of breakpoints with great accuracy. We show that our method does well inferring recombination breakpoints while at the same time maintaining practicality for larger datasets. In all cases, we confirm the breakpoint predictions of previous studies, and in many cases we offer novel predictions.Author Summary: In viral and bacterial pathogens, recombination has the ability to combine fitness-enhancing mutations. Accurate characterization of recombinant breakpoints in newly sequenced strains can provide information about the role of this process in evolution, for example, in immune evasion. Of particular interest are situations of an admixture of pathogen subspecies, recombination between whose genomes may change the apparent phylogenetic tree topology in different regions of a multiple-genome alignment. We describe an algorithm that can pinpoint recombination breakpoints to greater accuracy than previous methods, allowing detection of both short recombinant regions and long-range multiple crossovers. The algorithm is appropriate for the analysis of fast-evolving pathogen sequences where repeated substitutions may be observed at a single site in a multiple alignment (violating the “infinite sites” assumption inherent to some other breakpoint-detection algorithms). Simulations demonstrate the practicality of our implementation for alignments of longer sequences and more taxa than previous methods.
Date: 2009
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (2)
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000318 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 00318&type=printable (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1000318
DOI: 10.1371/journal.pcbi.1000318
Access Statistics for this article
More articles in PLOS Computational Biology from Public Library of Science
Bibliographic data for series maintained by ploscompbiol ().