Functional annotation of enzyme-encoding genes using deep learning with transformer layers
Gi Bae Kim,
Ji Yeon Kim,
Jong An Lee,
Charles J. Norsigian,
Bernhard O. Palsson and
Sang Yup Lee
Additional contact information
Gi Bae Kim: Korea Advanced Institute of Science and Technology (KAIST)
Ji Yeon Kim: Korea Advanced Institute of Science and Technology (KAIST)
Jong An Lee: Korea Advanced Institute of Science and Technology (KAIST)
Charles J. Norsigian: University of California San Diego
Bernhard O. Palsson: University of California San Diego
Sang Yup Lee: Korea Advanced Institute of Science and Technology (KAIST)
Nature Communications, 2023, vol. 14, issue 1, 1-11
Abstract:
Functional annotation of open reading frames in microbial genomes remains substantially incomplete. Enzymes constitute the most prevalent functional gene class in microbial genomes and can be described by their specific catalytic functions using the Enzyme Commission (EC) number. Consequently, the ability to predict EC numbers could substantially reduce the number of un-annotated genes. Here we present a deep learning model, DeepECtransformer, which utilizes transformer layers as a neural network architecture to predict EC numbers. Using the extensively studied Escherichia coli K-12 MG1655 genome, DeepECtransformer predicted EC numbers for 464 un-annotated genes. We experimentally validated the enzymatic activities predicted for three proteins (YgfF, YciO, and YjdM). Further examination of the neural network’s reasoning process revealed that the trained neural network relies on functional motifs of enzymes to predict EC numbers. Thus, DeepECtransformer is a method that facilitates the functional annotation of uncharacterized genes.
Date: 2023
Downloads: https://www.nature.com/articles/s41467-023-43216-z (abstract, text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-43216-z
Ordering information: This journal article can be ordered from https://www.nature.com/ncomms/
DOI: 10.1038/s41467-023-43216-z
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.