A maximum-entropy model for predicting chromatin contacts
Pau Farré and Eldon Emberly
PLOS Computational Biology, 2018, vol. 14, issue 2, 1-16
Abstract:
The packaging of DNA inside a nucleus shows complex structure stabilized by a host of DNA-bound factors. Both the distribution of these factors and the contacts between different genomic locations of the DNA can now be measured on a genome-wide scale. This has advanced the development of models aimed at predicting the conformation of DNA given only the locations of bound factors (the chromatin folding problem). Here we present a maximum-entropy model that is able to predict a contact map representation of structure given a sequence of bound factors. Non-local effects due to the sequence neighborhood around contacting sites are found to be important for making accurate predictions. Lastly, we show that the model can be used to infer a sequence of bound factors given only a measurement of structure. This opens up the possibility of efficiently predicting sequence regions that may play a role in generating cell-type-specific structural differences.
Author summary:
The three-dimensional folding of DNA inside the nucleus into specific conformations is necessary for the proper functioning of cells. These structures can be measured by chromosome conformation capture methods (Hi-C), which report the number of times that each pair of genomic sites is found in close proximity in a cell-population experiment. A number of protein complexes that bind to the DNA have been found to stabilize such conformations. However, identifying the precise relation between the positioning of binding proteins and the resulting structures is still an open problem. Here we present a maximum-entropy method able to predict Hi-C contact probabilities from a sequence of binding factors without the need to perform any polymer simulations. We envision that this method will allow experimentalists to efficiently calculate the expected structural effect of altering the sequence of binding factors. In addition, we show that our model is capable of solving the inverse problem, namely predicting the underlying sequence of binding factors from a set of observed contact probabilities.
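To make the idea of predicting a contact map from a sequence of bound factors concrete, here is a minimal Python sketch of a generic log-linear (maximum-entropy) form. It is an illustration under stated assumptions, not the authors' model: each contact is treated as an independent binary variable whose probability depends only on the factor states at the two sites and on their genomic separation. The function name `contact_probabilities`, the coupling matrix `J`, the distance term `h`, and all numeric values are hypothetical. The sketch also omits the non-local neighborhood effects that the abstract reports as important.

```python
import numpy as np

def contact_probabilities(states, J, h):
    """Toy log-linear contact model (illustrative assumption, not the paper's model).

    states : integer factor state at each genomic site, length n
    J      : symmetric (k x k) matrix of hypothetical state-pair couplings
    h      : length-n array of hypothetical distance-dependent biases
    Returns an (n x n) matrix with P(contact i, j) = sigmoid(J[s_i, s_j] + h[|i - j|]).
    """
    states = np.asarray(states)
    n = len(states)
    idx = np.arange(n)
    # Genomic separation |i - j| for every pair of sites
    dist = np.abs(idx[:, None] - idx[None, :])
    # Pairwise logit: state-pair coupling plus distance bias
    logits = J[states[:, None], states[None, :]] + h[dist]
    p = 1.0 / (1.0 + np.exp(-logits))
    # By convention here, a site is always in contact with itself
    np.fill_diagonal(p, 1.0)
    return p

# Hypothetical two-state sequence of bound factors (values chosen for illustration)
states = [0, 0, 1, 1, 0, 1]
J = np.array([[1.0, -1.0],
              [-1.0, 2.0]])          # like states attract, unlike states repel
h = -0.5 * np.arange(len(states))    # contacts become less likely with separation
P = contact_probabilities(states, J, h)
```

In this toy setup, neighboring sites that share state 1 receive a higher contact probability than neighboring sites that share state 0, simply because the assumed coupling `J[1, 1]` is larger than `J[0, 0]`. Fitting such parameters to measured Hi-C contact frequencies would be a separate step that is not shown here.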
Date: 2018
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005956 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 05956&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1005956
DOI: 10.1371/journal.pcbi.1005956