A New Order Estimator for Fixed and Variable Length Markov Models with Applications to DNA Sequence Similarity

Dalevi Daniel, Dubhashi Devdatt and Hermansson Malte
Additional contact information
Dalevi Daniel: Chalmers University of Technology
Dubhashi Devdatt: Chalmers University of Technology
Hermansson Malte: Gothenburg University

Statistical Applications in Genetics and Molecular Biology, 2006, vol. 5, issue 1, 26

Abstract: Recently Peres and Shields discovered a new method for estimating the order of a stationary fixed order Markov chain. They showed that the estimator is consistent by proving a threshold result. While this threshold is valid asymptotically in the limit, it is not very useful for DNA sequence analysis where data sizes are moderate. In this paper we give a novel interpretation of the Peres-Shields estimator as a sharp transition phenomenon. This yields a precise and powerful estimator that quickly identifies the core dependencies in data. We show that it compares favorably to other estimators, especially in the presence of variable dependencies. Motivated by this last point, we extend the Peres-Shields estimator to Variable Length Markov Chains. We compare it to a well-established estimator and show that it is superior in terms of the predictive likelihood. We give an application to the problem of detecting DNA sequence similarity in plasmids.

Keywords: computational biology/bioinformatics; statistical models; statistical theory and methods
Date: 2006

Downloads: (external link)
https://doi.org/10.2202/1544-6115.1214 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.

Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:5:y:2006:i:1:n:8

Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html

DOI: 10.2202/1544-6115.1214

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf

More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter
Bibliographic data for series maintained by Peter Golla.

 
Page updated 2025-03-19