Estimating Motifs Under Order Restrictions

Erik W. van Zwet, Katherina J. Kechris, Peter J. Bickel and Michael B. Eisen
Additional contact information
Erik W. van Zwet: Mathematical Institute, Leiden University
Katherina J. Kechris: Department of Biochemistry and Biophysics, University of California, San Francisco
Peter J. Bickel: Department of Statistics, University of California, Berkeley
Michael B. Eisen: Department of Molecular and Cell Biology, University of California, Berkeley; Life Sciences Division, Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley

Statistical Applications in Genetics and Molecular Biology, 2005, vol. 4, issue 1, 18

Abstract: Transcription factors and many other DNA-binding proteins recognize more than one specific sequence. Among sequences recognized by a given DNA-binding protein, different positions exhibit varying degrees of conservation. The reason is that base pairs that are more extensively contacted by the protein tend to be more conserved. This observation can be used in the discovery of transcription factor binding sites. Here we present a rigorous means to accomplish this. In particular, we constrain the order of the information (entropy) in the columns of the position specific weight matrix (PWM) which characterizes the motif being sought. We then show how to compute the maximum likelihood estimate of a PWM under such order restrictions. This computation is easily integrated with the EM algorithm or the Gibbs sampler to enhance performance in the search for motifs in unaligned sequences. We demonstrate our method on a well-known data set of binding sites of the transcription factor Crp in E. coli.
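
Illustration (not part of the article): the abstract refers to ordering the information in the columns of a PWM. The sketch below is a minimal, hypothetical example of what such an ordering looks like. It assumes the common convention that a column's information content is log2(4) minus its Shannon entropy in bits, and it uses made-up base counts. It only computes unconstrained column frequencies and checks whether a given order holds; it does not reproduce the authors' constrained maximum likelihood procedure.

```python
import math

def column_information(counts, alphabet_size=4):
    """Information content in bits of one PWM column, from raw base counts.

    Assumed convention: log2(alphabet_size) minus the Shannon entropy of the
    observed base frequencies (the unconstrained multinomial estimate).
    """
    total = sum(counts)
    entropy = 0.0
    for c in counts:
        if c > 0:  # terms with zero count contribute nothing to the entropy
            p = c / total
            entropy -= p * math.log2(p)
    return math.log2(alphabet_size) - entropy

def satisfies_order(info, order):
    """True if info[order[0]] >= info[order[1]] >= ... holds."""
    return all(info[a] >= info[b] for a, b in zip(order, order[1:]))

# Hypothetical counts of (A, C, G, T) at three motif positions.
counts = [
    [10, 0, 0, 0],  # fully conserved position
    [5, 5, 0, 0],   # two bases equally frequent
    [3, 3, 2, 2],   # close to uniform
]
info = [column_information(col) for col in counts]

# Check a hypothetical restriction: position 0 is at least as informative
# as position 1, which is at least as informative as position 2.
print(info)
print(satisfies_order(info, [0, 1, 2]))
```

In this toy setting, a restriction such as "position 0 is at least as conserved as position 2" is a statement about these per-column values; the article addresses how to estimate the PWM when such a restriction is imposed.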

Date: 2005

Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:4:y:2005:i:1:n:1

Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html

DOI: 10.2202/1544-6115.1100

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf

Bibliographic data for series maintained by Peter Golla.

 
Handle: RePEc:bpj:sagmbi:v:4:y:2005:i:1:n:1