Protein Sectors: Statistical Coupling Analysis versus Conservation
Tiberiu Teşileanu,
Lucy J Colwell and
Stanislas Leibler
PLOS Computational Biology, 2015, vol. 11, issue 2, 1-20
Abstract:
Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed “sectors”. The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation.
Author Summary:
Statistical analyses of alignments of evolutionarily related protein sequences have been proposed as a method for obtaining information about protein structure and function. One such method, called statistical coupling analysis, identifies patterns of correlated mutations and uses them to find groups of coevolving residues. These groups, called protein sectors, have been reported to be relevant for various functional aspects, such as enzymatic efficiency, protein stability, or allostery. Here, we reanalyze existing data in order to assess the relative importance of two factors contributing to statistical coupling analysis, namely single-site amino acid frequencies and pairwise correlations. Although correlations have been shown to be informative in other studies, we point out that in existing large-scale data that has been analyzed with statistical coupling analysis, single-site statistics seems to be a dominating factor.
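The abstract describes SCA as spectral analysis of a matrix that combines pairwise correlations with per-site conservation. The sketch below illustrates that idea in a simplified form; it is not the authors' implementation. It assumes a binary approximation (each alignment entry is reduced to whether it matches the most common residue in its column), a uniform background frequency of 1/20, and a conservation-derived weight given by the derivative of the relative entropy. The function names `sca_like_matrix` and `top_eigenmode`, the integer encoding of residues, and the clipping parameter `eps` are all hypothetical choices made for illustration.

```python
import numpy as np

def sca_like_matrix(alignment, n_states=20, eps=1e-3):
    """Return (conservation, weighted_matrix) for an integer-coded alignment.

    alignment: array of shape (n_seqs, n_cols), entries in 0..n_states-1.
    Assumptions (not from the source): binary approximation, uniform
    background frequency q = 1/n_states, weights = d(conservation)/d(f).
    """
    alignment = np.asarray(alignment)
    n_seqs, _ = alignment.shape

    # Most common residue in each column.
    dominant = np.array(
        [np.bincount(col, minlength=n_states).argmax() for col in alignment.T]
    )
    # Binary approximation: 1 where a sequence carries the dominant residue.
    x = (alignment == dominant).astype(float)

    mean = x.mean(axis=0)
    f = np.clip(mean, eps, 1.0 - eps)  # avoid log(0) at fully conserved sites
    q = 1.0 / n_states                 # assumed uniform background

    # Per-site conservation: relative entropy of f against the background q.
    conservation = f * np.log(f / q) + (1 - f) * np.log((1 - f) / (1 - q))
    # Conservation-derived weights (derivative of the relative entropy in f).
    weights = np.log(f * (1 - q) / ((1 - f) * q))

    # Pairwise covariance of the binary alignment.
    cov = (x.T @ x) / n_seqs - np.outer(mean, mean)
    # Combine correlation information with conservation.
    weighted = np.outer(weights, weights) * np.abs(cov)
    return conservation, weighted

def top_eigenmode(matrix):
    """Largest eigenvalue and its eigenvector for a symmetric matrix."""
    vals, vecs = np.linalg.eigh(matrix)  # eigenvalues in ascending order
    v = vecs[:, -1]
    if v.sum() < 0:                      # fix the arbitrary sign
        v = -v
    return vals[-1], v
```

In a single-sector setting, one way to probe the paper's claim would be to compare the sites ranked highest by the top eigenvector with the sites ranked highest by `conservation`; the sketch only sets up the quantities needed for such a comparison and makes no claim about the outcome.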
Date: 2015
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004091 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 04091&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1004091
DOI: 10.1371/journal.pcbi.1004091