Bayesian inference of protein conformational ensembles from limited structural data
Wojciech Potrzebowski, Jill Trewhella and Ingemar Andre
PLOS Computational Biology, 2018, vol. 14, issue 12, 1-27
Abstract:
Many proteins consist of folded domains connected by regions with higher flexibility. The details of the resulting conformational ensemble play a central role in controlling interactions between domains and with binding partners. Small-Angle Scattering (SAS) is well-suited to study the conformational states adopted by proteins in solution. However, analysis is complicated by the limited information content in SAS data, and care must be taken to avoid constructing overly complex ensemble models and fitting to noise in the experimental data. To address these challenges, we developed a method based on Bayesian statistics that infers conformational ensembles from a structural library generated by all-atom Monte Carlo simulations. The first stage of the method involves a fast model selection based on variational Bayesian inference that maximizes the model evidence of the selected ensemble. This is followed by a complete Bayesian inference of population weights in the selected ensemble. Experiments with simulated ensembles demonstrate that model evidence is capable of identifying the correct ensemble and that the correct number of ensemble members can be recovered up to high levels of noise. Using experimental data, we demonstrate how the method can be extended to include data from Nuclear Magnetic Resonance (NMR) and structural energies of conformers extracted from all-atom energy functions. We show that the data from SAXS, NMR chemical shifts and energies calculated from conformers can work synergistically to improve the definition of the conformational ensemble.
Author summary:
Proteins are commonly built up of folded domains connected by regions with higher flexibility. The interdomain orientations encoded by such hinges or linkers can play central roles in controlling the function of multidomain proteins, which makes them important to characterize. Small-Angle X-ray Scattering (SAXS) is uniquely suited to study the conformational ensembles adopted by these kinds of proteins. However, because of the limited information provided by SAXS, ensemble models must be built by combination with other information sources, and care must be taken to avoid constructing ensembles that are more complex than the data can support. We developed a method based on Bayesian statistics that combines data from molecular simulation with experimental data from SAXS and Nuclear Magnetic Resonance while automatically balancing the complexity of the ensemble model with the information in the data. We demonstrate that this method is capable of accurate inference of ensembles even in the presence of high levels of experimental noise. The method represents a general approach to combining data and simulation in the modeling of protein ensembles and can be extended to employ additional sources of experimental information.
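Illustration (not from the article): the idea of inferring population weights from a scattering curve can be sketched with a toy two-conformer ensemble. The sketch below is not the authors' method; it skips the variational model-selection stage and the Monte Carlo structural library. It assumes Guinier-like intensity curves with made-up radii of gyration (15 and 25 Å), Gaussian noise of known standard deviation, and a flat prior on the weight, and it evaluates the posterior on a grid instead of sampling. All function names and parameter values are hypothetical.

```python
import numpy as np

def ensemble_curve(weights, curves):
    """Population-weighted average of the per-conformer scattering curves."""
    return np.asarray(weights) @ np.asarray(curves)

def log_likelihood(weights, curves, observed, sigma):
    """Gaussian log-likelihood of the observed curve, up to an additive constant."""
    residual = (observed - ensemble_curve(weights, curves)) / sigma
    return -0.5 * np.sum(residual ** 2)

def posterior_on_grid(curves, observed, sigma, n_grid=201):
    """Posterior over w, the weight of conformer 0, for a two-member ensemble.

    Assumes a flat prior on w in [0, 1]; conformer 1 gets weight 1 - w.
    """
    w = np.linspace(0.0, 1.0, n_grid)
    logp = np.array(
        [log_likelihood([wi, 1.0 - wi], curves, observed, sigma) for wi in w]
    )
    p = np.exp(logp - logp.max())  # subtract the max for numerical stability
    return w, p / p.sum()

# Toy data: Guinier-like curves I(q) = exp(-(q * Rg)^2 / 3) with assumed Rg values.
rng = np.random.default_rng(0)
q = np.linspace(0.01, 0.3, 50)
curves = np.stack([np.exp(-(q * rg) ** 2 / 3.0) for rg in (15.0, 25.0)])
true_w = np.array([0.7, 0.3])   # assumed "true" populations
sigma = 0.01                    # assumed noise level
observed = ensemble_curve(true_w, curves) + rng.normal(0.0, sigma, q.size)

w_grid, post = posterior_on_grid(curves, observed, sigma)
w_mean = float(np.sum(w_grid * post))
print(f"posterior mean weight of conformer 0: {w_mean:.3f}")
```

With more than two conformers a grid becomes impractical, which is one reason sampling-based or variational inference is used for realistic ensembles.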
Date: 2018
Downloads:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006641 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 06641&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1006641
DOI: 10.1371/journal.pcbi.1006641
More articles in PLOS Computational Biology from Public Library of Science