Composite Likelihood Modeling of Neighboring Site Correlations of DNA Sequence Substitution Rates
Deng Ling and
Moore Dirk F.
Additional contact information
Deng Ling: Johnson & Johnson
Moore Dirk F.: University of Medicine and Dentistry of New Jersey
Statistical Applications in Genetics and Molecular Biology, 2009, vol. 8, issue 1, 22
Abstract:
Sequence data from a series of homologous DNA segments from related organisms are typically polymorphic at many sites, and these polymorphisms are the result of evolutionary processes. Such data may be used to estimate the substitution rates as well as the variability of these rates. Careful characterization of the distribution of this variation is essential for accurate estimation of evolutionary distances and phylogeny reconstruction among these sequences. Many researchers have recognized the importance of the variability of substitution rates, which most have modeled using a discrete gamma distribution. Some have extended these methods to explicitly account for the correlation of substitution rates among sites using hidden Markov models; others have proposed context-dependent substitution rate schemes. We accommodate these correlations using a composite likelihood method based on a bivariate gamma distribution, which is more flexible than hidden Markov models in terms of correlation structure and more computationally tractable compared to the context-dependent schemes. We show that the estimates have good theoretical properties. We also use simulations to compare the maximum composite likelihood estimates to those obtained from maximum likelihood based on the independence assumption. We use data from the mitochondrial DNA of ten primates to obtain maximum composite likelihood estimates of the mean substitution rate, overdispersion, and correlation parameters, and use these estimates in a parametric phylogenetic bootstrap to assess the impact of serial correlation on the estimates of substitution rates and branch lengths.
Keywords: bivariate negative binomial distribution; composite likelihood; substitution rate; phylogeny; parametric bootstrap (search for similar items in EconPapers)
Date: 2009
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
https://doi.org/10.2202/1544-6115.1391 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:8:y:2009:i:1:n:6
Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html
DOI: 10.2202/1544-6115.1391
Access Statistics for this article
Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf
More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter
Bibliographic data for series maintained by Peter Golla ().