Composite Likelihood Modeling of Neighboring Site Correlations of DNA Sequence Substitution Rates

Ling Deng and Dirk F. Moore
Additional contact information
Ling Deng: Johnson & Johnson
Dirk F. Moore: University of Medicine and Dentistry of New Jersey

Statistical Applications in Genetics and Molecular Biology, 2009, vol. 8, issue 1, 22

Abstract: Sequence data from a series of homologous DNA segments from related organisms are typically polymorphic at many sites, and these polymorphisms are the result of evolutionary processes. Such data may be used to estimate the substitution rates as well as the variability of these rates. Careful characterization of the distribution of this variation is essential for accurate estimation of evolutionary distances and phylogeny reconstruction among these sequences. Many researchers have recognized the importance of the variability of substitution rates, which most have modeled using a discrete gamma distribution. Some have extended these methods to explicitly account for the correlation of substitution rates among sites using hidden Markov models; others have proposed context-dependent substitution rate schemes. We accommodate these correlations using a composite likelihood method based on a bivariate gamma distribution, which is more flexible than hidden Markov models in terms of correlation structure and more computationally tractable than the context-dependent schemes. We show that the estimates have good theoretical properties. We also use simulations to compare the maximum composite likelihood estimates to those obtained from maximum likelihood based on the independence assumption. We use data from the mitochondrial DNA of ten primates to obtain maximum composite likelihood estimates of the mean substitution rate, overdispersion, and correlation parameters, and use these estimates in a parametric phylogenetic bootstrap to assess the impact of serial correlation on the estimates of substitution rates and branch lengths.

Keywords: bivariate negative binomial distribution; composite likelihood; substitution rate; phylogeny; parametric bootstrap
Date: 2009

Downloads: (external link)
https://doi.org/10.2202/1544-6115.1391 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.

Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:8:y:2009:i:1:n:6

Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html

DOI: 10.2202/1544-6115.1391

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf

More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter
Bibliographic data for series maintained by Peter Golla.

 
Page updated 2025-03-19
Handle: RePEc:bpj:sagmbi:v:8:y:2009:i:1:n:6