Performance of maximum parsimony and likelihood phylogenetics when evolution is heterogeneous
Bryan Kolaczkowski and Joseph W. Thornton
Additional contact information
Bryan Kolaczkowski: Department of Computer and Information Science
Joseph W. Thornton: Center for Ecology and Evolutionary Biology, University of Oregon
Nature, 2004, vol. 431, issue 7011, 980-984
Abstract:
All inferences in comparative biology depend on accurate estimates of evolutionary relationships. Recent phylogenetic analyses have turned away from maximum parsimony towards the probabilistic techniques of maximum likelihood and bayesian Markov chain Monte Carlo (BMCMC). These probabilistic techniques represent a parametric approach to statistical phylogenetics, because their criterion for evaluating a topology (the probability of the data, given the tree) is calculated with reference to an explicit evolutionary model from which the data are assumed to be identically distributed. Maximum parsimony can be considered nonparametric, because trees are evaluated on the basis of a general metric (the minimum number of character state changes required to generate the data on a given tree) without assuming a specific distribution [1]. The shift to parametric methods was spurred, in large part, by studies showing that although both approaches perform well most of the time [2], maximum parsimony is strongly biased towards recovering an incorrect tree under certain combinations of branch lengths, whereas maximum likelihood is not [3-6]. All these evaluations simulated sequences by a largely homogeneous evolutionary process in which data are identically distributed. There is ample evidence, however, that real-world gene sequences evolve heterogeneously and are not identically distributed [7-16]. Here we show that maximum likelihood and BMCMC can become strongly biased and statistically inconsistent when the rates at which sequence sites evolve change non-identically over time. Maximum parsimony performs substantially better than current parametric methods over a wide range of conditions tested, including moderate heterogeneity and phylogenetic problems not normally considered difficult.
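Illustrative note (not part of the published abstract): the parsimony metric described above, the minimum number of character state changes a tree requires, can be sketched with Fitch's small-parsimony algorithm. The function names, the nested-tuple tree encoding, and the four-taxon toy alignment below are assumptions made for illustration only; they are not the authors' code or data.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical tree encoding: a leaf is a taxon name, an internal node is a
# (left, right) pair. Rooting is arbitrary; the Fitch score does not depend on it.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_site(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Return (candidate ancestral states, minimum changes) for one site."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # Children can share a state: no extra change is needed at this node.
        return common, left_cost + right_cost
    # Children disagree: one change is required on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree: Tree, alignment: Dict[str, str]) -> int:
    """Sum the per-site Fitch costs over all columns of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site_states = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, site_states)[1]
    return total


# Toy data (invented for this sketch): four taxa, four sites.
alignment = {"A": "AACG", "B": "AATG", "C": "GACT", "D": "GATT"}
tree_ab = (("A", "B"), ("C", "D"))  # groups A with B
tree_ac = (("A", "C"), ("B", "D"))  # groups A with C

if __name__ == "__main__":
    print(parsimony_score(tree_ab, alignment))
    print(parsimony_score(tree_ac, alignment))
```

Under this criterion the topology with the lower score is preferred, and no substitution model or branch-length distribution is assumed. That is the sense in which the abstract calls maximum parsimony nonparametric.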
Date: 2004
Downloads: https://www.nature.com/articles/nature02917 (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:nat:nature:v:431:y:2004:i:7011:d:10.1038_nature02917
DOI: 10.1038/nature02917
Nature is currently edited by Magdalena Skipper