Resolving the Ortholog Conjecture: Orthologs Tend to Be Weakly, but Significantly, More Similar in Function than Paralogs

Altenhoff, Adrian M; Studer, Romain A; Robinson-Rechavi, Marc; Dessimoz, Christophe

Resolving the Ortholog Conjecture: Orthologs Tend to Be Weakly, but Significantly, More Similar in Function than Paralogs

Adrian M Altenhoff, Romain A Studer, Marc Robinson-Rechavi and Christophe Dessimoz

PLOS Computational Biology, 2012, vol. 8, issue 5, 1-10

Abstract: The function of most proteins is not determined experimentally, but is extrapolated from homologs. According to the “ortholog conjecture”, or standard model of phylogenomics, protein function changes rapidly after duplication, leading to paralogs with different functions, while orthologs retain the ancestral function. We report here that a comparison of experimentally supported functional annotations among homologs from 13 genomes mostly supports this model. We show that to analyze GO annotation effectively, several confounding factors need to be controlled: authorship bias, variation of GO term frequency among species, variation of background similarity among species pairs, and propagated annotation bias. After controlling for these biases, we observe that orthologs have generally more similar functional annotations than paralogs. This is especially strong for sub-cellular localization. We observe only a weak decrease in functional similarity with increasing sequence divergence. These findings hold over a large diversity of species; notably orthologs from model organisms such as E. coli, yeast or mouse have conserved function with human proteins. Author Summary: To infer the function of an unknown gene, possibly the most effective way is to identify a well-characterized evolutionarily related gene, and assume that they have both kept their ancestral function. If several such homologs are available, all else being equal, it has long been assumed that those that diverged by speciation (“ortholog”) are functionally closer than those that diverged by duplication (“paralogs”); thus function is more reliably inferred from the former. But despite its prevalence, this model mostly rests on first principles, as for the longest time we have not had sufficient data to test it empirically. Recently, some studies began investigating this question and have cast doubt on the validity of this model. Here, we show that by considering a wide range of organisms and data, and, crucially, by correcting for several easily overlooked biases affecting functional annotations, the standard model is corroborated by the presently available experimental data.

Date: 2012
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (3)

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002514 (text/html)
https://journals.plos.org/ploscompbiol/article?id= ... 02514&type=printable (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1002514

DOI: 10.1371/journal.pcbi.1002514

Access Statistics for this article

More articles in PLOS Computational Biology from Public Library of Science
Bibliographic data for series maintained by ploscompbiol ().