CpG Island Identification with Higher Order and Variable Order Markov Models

Zhenqiu Liu, Dechang Chen and Xue-wen Chen
Additional contact information
Zhenqiu Liu: TATRC
Dechang Chen: Uniformed Services University of the Health Sciences
Xue-wen Chen: The University of Kansas

A chapter in Data Mining in Biomedicine, 2007, pp 47-57 from Springer

Abstract: Identifying the location and function of human genes in a long genomic sequence is difficult because information about the genes is insufficient. Experimental evidence suggests a strong correlation between CpG islands and the genes immediately following them. Much research has been done to identify CpG islands in DNA sequences using various models. In this chapter, we introduce two alternative models based on higher-order and variable-order Markov chains. Compared with popular models such as the first-order Markov chain, the hidden Markov model (HMM), and HMT, these two models are much easier to compute and have higher identification accuracy. One unsolved problem with the Markov model is that there is no way to decide the exact boundary point between CpG islands and non-CpG regions. In this chapter, we provide a novel tool to decide the boundary points using the sequential probability ratio test (SPRT). Sequential data from GenBank are used for the experiments in this chapter.

Keywords: DNA sequences; CpG islands; Markov models; Probability Suffix Trees (PST); sequential probability ratio test (SPRT)
Date: 2007

Persistent link: https://EconPapers.repec.org/RePEc:spr:spochp:978-0-387-69319-4_4

Ordering information: This item can be ordered from
http://www.springer.com/9780387693194

DOI: 10.1007/978-0-387-69319-4_4

More chapters in Springer Optimization and Its Applications from Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Handle: RePEc:spr:spochp:978-0-387-69319-4_4