Sparse Markov Chains for Sequence Data

Väinö Jääskinen, Jie Xiong, Jukka Corander and Timo Koski

Scandinavian Journal of Statistics, 2014, vol. 41, issue 3, 639-655

Abstract: Finite memory sources and variable-length Markov chains have recently gained popularity in data compression and mining, in particular, for applications in bioinformatics and language modelling. Here, we consider denser data compression and prediction with a family of sparse Bayesian predictive models for Markov chains in finite state spaces. Our approach lumps transition probabilities into classes composed of invariant probabilities, such that the resulting models need not have a hierarchical structure as in context tree-based approaches. This can lead to a substantially higher rate of data compression, and such non-hierarchical sparse models can be motivated for instance by data dependence structures existing in the bioinformatics context. We describe a Bayesian inference algorithm for learning sparse Markov models through clustering of transition probabilities. Experiments with DNA sequence and protein data show that our approach is competitive in both prediction and classification when compared with several alternative methods on the basis of variable memory length.
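
Illustration (not from the article): the idea of lumping transition probabilities into classes can be sketched as follows. Contexts of a fixed order are partitioned into classes, every context in a class shares one transition distribution, and a partition is scored by its log marginal likelihood. The symmetric Dirichlet prior, the `alpha` value, and the function names `transition_counts` and `log_marginal` are assumptions made for this sketch; the prior and search algorithm actually used by the authors may differ.

```python
from collections import Counter
from math import lgamma


def transition_counts(seq, order):
    """Count how often each symbol follows each length-`order` context."""
    counts = {}
    for i in range(order, len(seq)):
        ctx = seq[i - order:i]
        counts.setdefault(ctx, Counter())[seq[i]] += 1
    return counts


def log_marginal(counts, partition, alphabet, alpha=0.5):
    """Log marginal likelihood of the transitions for a given partition.

    Assumption for this sketch: each class of contexts has its own
    transition distribution with a symmetric Dirichlet(alpha) prior.
    """
    k = len(alphabet)
    total = 0.0
    for cls in partition:
        # Pool the counts of all contexts lumped into this class.
        pooled = Counter()
        for ctx in cls:
            pooled.update(counts.get(ctx, Counter()))
        n = sum(pooled.values())
        # Dirichlet-multinomial marginal likelihood of the pooled counts.
        total += lgamma(k * alpha) - lgamma(k * alpha + n)
        total += sum(lgamma(alpha + pooled[a]) - lgamma(alpha) for a in alphabet)
    return total


if __name__ == "__main__":
    seq = "ACGTACGTTTGACCAGTACGATCGATTAGC"  # made-up toy sequence
    alphabet = "ACGT"
    counts = transition_counts(seq, order=1)
    full = [[a] for a in alphabet]          # one class per context
    lumped = [["A", "C"], ["G", "T"]]       # hypothetical lumping
    print("full  :", log_marginal(counts, full, alphabet))
    print("lumped:", log_marginal(counts, lumped, alphabet))
```

A clustering procedure of the kind the abstract describes would compare such scores across candidate partitions. Because the classes need not correspond to shared suffixes, the resulting model is not restricted to a context-tree structure.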

Date: 2014

Downloads: (external link)
http://hdl.handle.net/10.1111/sjos.12053 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:bla:scjsta:v:41:y:2014:i:3:p:639-655

Ordering information: This journal article can be ordered from
http://www.blackwell ... bs.asp?ref=0303-6898

Scandinavian Journal of Statistics is currently edited by Ørnulf Borgan and Bo Lindqvist

More articles in Scandinavian Journal of Statistics from Danish Society for Theoretical Statistics, Finnish Statistical Society, Norwegian Statistical Association, Swedish Statistical Association
Bibliographic data for series maintained by Wiley Content Delivery.

 
Page updated 2025-03-19
Handle: RePEc:bla:scjsta:v:41:y:2014:i:3:p:639-655