Genome-Wide Localization of Protein-DNA Binding and Histone Modification by a Bayesian Change-Point Method with ChIP-seq Data

Haipeng Xing, Yifan Mo, Will Liao and Michael Q Zhang

PLOS Computational Biology, 2012, vol. 8, issue 7, 1-12

Abstract: Next-generation sequencing (NGS) technologies have matured considerably since their introduction and a focus has been placed on developing sophisticated analytical tools to deal with the amassing volumes of data. Chromatin immunoprecipitation sequencing (ChIP-seq), a major application of NGS, is a widely adopted technique for examining protein-DNA interactions and is commonly used to investigate epigenetic signatures of diffuse histone marks. These datasets have notoriously high variance and subtle levels of enrichment across large expanses, making them exceedingly difficult to define. Windows-based, heuristic models and finite-state hidden Markov models (HMMs) have been used with some success in analyzing ChIP-seq data but with lingering limitations. To improve the ability to detect broad regions of enrichment, we developed a stochastic Bayesian Change-Point (BCP) method, which addresses some of these unresolved issues. BCP makes use of recent advances in infinite-state HMMs by obtaining explicit formulas for posterior means of read densities. These posterior means can be used to categorize the genome into enriched and unenriched segments, as is customarily done, or examined for more detailed relationships since the underlying subpeaks are preserved rather than simplified into a binary classification. BCP performs a near exhaustive search of all possible change points between different posterior means at high-resolution to minimize the subjectivity of window sizes and is computationally efficient, due to a speed-up algorithm and the explicit formulas it employs. In the absence of a well-established “gold standard” for diffuse histone mark enrichment, we corroborated BCP's island detection accuracy and reproducibility using various forms of empirical evidence. 
We show that BCP is especially suited for analysis of diffuse histone ChIP-seq data but also effective in analyzing punctate transcription factor ChIP datasets, making it widely applicable for numerous experiment types.

Author Summary: To unravel the mechanisms of gene regulation, understanding the complex interplay of protein-DNA interactions is instrumental. Recently, chromatin immunoprecipitation, coupled with next-generation sequencing (ChIP-seq), has risen as the go-to technique for examining these interactions on a genome-wide scale. It has also given rise to new computational issues. One such difficulty is the large variation in read density profiles from different types of NGS data, including variable peak “shapes” ranging from punctate to diffuse enrichment segments. To address this issue, we developed an infinite-state hidden Markov model that resulted in explicit formulas for the estimation of read density enrichment and can be used to find both significant “peaks” and broad segments. We show the versatility of BCP in analyzing various ChIP-seq data, which can further our understanding of the role of transcription factors in gene regulatory networks and histone modification marks in epigenomic modulation.

Date: 2012

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002613 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 02613&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1002613

DOI: 10.1371/journal.pcbi.1002613

More articles in PLOS Computational Biology from Public Library of Science
Page updated 2025-03-19
Handle: RePEc:plo:pcbi00:1002613