Interspecific comparison of gene expression profiles using machine learning

Artem S Kasianov, Anna V Klepikova, Alexey V Mayorov, Gleb S Buzanov, Maria D Logacheva and Aleksey A Penin

PLOS Computational Biology, 2023, vol. 19, issue 1, 1-20

Abstract: Interspecific gene comparisons are a keystone of many areas of biological research and are especially important for translating knowledge from model organisms to economically important species. Currently they are hampered by the low resolution of methods based on sequence analysis and by the complex evolutionary history of eukaryotic genes. This is especially critical for plants, whose genomes are shaped by multiple whole genome duplications and subsequent gene loss. New methods for comparing the functions of genes in different species are therefore needed. Here, we report ISEEML (Interspecific Similarity of Expression Evaluated using Machine Learning), a novel machine learning-based algorithm for interspecific gene classification. In contrast to previous studies focused on sequence similarity, our algorithm focuses on functional similarity inferred from the comparison of gene expression profiles. We propose a novel metric for expression pattern similarity, the expression score (ES), that is suitable for species with differing morphologies. As a proof of concept, we compare detailed transcriptome maps of Arabidopsis thaliana (the model species), Zea mays (maize) and Fagopyrum esculentum (common buckwheat), species that represent distant clades within flowering plants. The classifier achieved an AUC of 0.91; at an ES threshold of 0.5, the specificity was 94% and the sensitivity was 72%.

Author summary: Interspecific gene comparisons are a keystone of many areas of biological research and are especially important for translating knowledge from model organisms to economically important species. Currently, they are based on the concept of orthology: orthologs are assumed to have similar functions (the so-called Ortholog Conjecture). This approach is problematic for two reasons: 1) the universal applicability of the Ortholog Conjecture is arguable; 2) orthology inference is complicated by multiple whole genome duplications and subsequent gene loss, processes typical of most eukaryotic organisms. We report a novel machine learning-based algorithm for interspecific gene comparison. In contrast to previous studies, which focus on sequence similarity, it focuses on similarity of function at the organismic level, approximated by expression patterns. As a source of information for the classification, we use detailed gene expression maps. Our study proposes, for the first time, a metric for comparing expression maps that is suitable for species with differing morphologies and/or developmental rates. Without such a metric, expression maps could be compared only between closely related species with similar morphology or at very low resolution. In contrast, our approach is suitable for a wide range of organisms, with no limitations on their morphology or on the resolution of the expression maps.
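
The abstract reports an AUC together with specificity and sensitivity at an ES threshold of 0.5. As a reading aid, the sketch below shows how such figures are conventionally computed from per-pair scores and true labels. It is not the ISEEML implementation: the function names, the toy scores and labels, and the convention that a score at or above the threshold counts as a positive call are all assumptions made for illustration.

```python
def confusion_rates(scores, labels, threshold=0.5):
    """Return (sensitivity, specificity).

    Assumption: a score >= threshold is called positive; label 1 marks a true positive pair.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.7]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

sensitivity, specificity = confusion_rates(toy_scores, toy_labels, threshold=0.5)
toy_auc = auc(toy_scores, toy_labels)
print(sensitivity, specificity, toy_auc)
```

Raising the threshold generally trades sensitivity for specificity, while the AUC summarises ranking quality across all thresholds at once.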

Date: 2023

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010743 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 10743&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1010743

DOI: 10.1371/journal.pcbi.1010743
