Wiggle—Predicting Functionally Flexible Regions from Primary Sequence
Jenny Gu,
Michael Gribskov and
Philip E Bourne
PLOS Computational Biology, 2006, vol. 2, issue 7, 1-17
Abstract:
The Wiggle series are support vector machine–based predictors that identify regions of functional flexibility using only protein sequence information. Functionally flexible regions are defined as regions that can adopt different conformational states and are assumed to be necessary for bioactivity. Many advances have been made in understanding the relationship between protein sequence and structure. This work contributes to those efforts by making strides to understand the relationship between protein sequence and flexibility. A coarse-grained protein dynamic modeling approach was used to generate the dataset required for support vector machine training. We define our regions of interest based on the participation of residues in correlated large-scale fluctuations. Even with this structure-based approach to computationally define regions of functional flexibility, predictors successfully extract sequence-flexibility relationships that have been experimentally confirmed to be functionally important. Thus, a sequence-based tool to identify flexible regions important for protein function has been created. The ability to identify functional flexibility using a sequence-based approach complements structure-based definitions and will be especially useful for the large majority of proteins with unknown structures. The methodology offers promise to identify structural genomics targets amenable to crystallization and the possibility to engineer more flexible or rigid regions within proteins to modify their bioactivity.
Synopsis:
Proteins are not static entities in biology and are constantly changing their shape and form to perform their necessary biological roles. While we are intuitively aware of their constantly changing nature, we have little understanding of how their flexibility is encoded in the protein sequence. To address this knowledge gap, predictors were created to identify sequence patterns that dictate local regions to be flexible and serve a functional purpose.
By combining protein dynamic modeling and machine learning techniques, the Wiggle predictor series were able to generalize the sequence-flexibility relationship for all proteins. With these predictors we are able to identify flexible regions of functional importance such as hinges, recognition loops, and catalytic loops using only sequence information. This work makes important contributions to our understanding of the sequence-flexibility relationship and paves the way to identifying local sequence modulations that impact protein function without necessarily changing the structure.
Date: 2006
Downloads:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.0020090 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 20090&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:0020090
DOI: 10.1371/journal.pcbi.0020090
More articles in PLOS Computational Biology from Public Library of Science