A CNN-CBAM-BIGRU model for protein function prediction
Sharma Lavkush,
Deepak Akshay,
Ranjan Ashish and
Krishnasamy Gopalakrishnan
Additional contact information
Sharma Lavkush: Department of Computer Science and Engineering, National Institute of Technology Patna, Patna, Bihar, India
Deepak Akshay: Department of Computer Science and Engineering, National Institute of Technology Patna, Patna, Bihar, India
Ranjan Ashish: Department of Computer Science and Engineering, C.V. Raman Global University, Bhubaneswar, Odisha, India
Krishnasamy Gopalakrishnan: Department of Mathematics and Computer Science, Central State University, Wilberforce, USA
Statistical Applications in Genetics and Molecular Biology, 2024, vol. 23, issue 1, 23
Abstract:
Understanding a protein’s function from its amino acid sequence alone is a crucial but intricate task in bioinformatics, and one that has traditionally proven difficult. In recent years, however, deep learning has emerged as a powerful tool, achieving significant success in protein function prediction. Its strength lies in its ability to automatically learn informative features from protein sequences, which can then be used to predict a protein’s function. This study builds on these advances by proposing a novel model, CNN-CBAM+BiGRU, which incorporates a Convolutional Block Attention Module (CBAM) alongside bidirectional gated recurrent units (BiGRUs). CBAM acts as a spotlight, guiding the CNN to focus on the most informative parts of the protein data and leading to more accurate feature extraction. BiGRUs, a type of recurrent neural network (RNN), excel at capturing long-range dependencies within the protein sequence, which are essential for accurate function prediction. The proposed model thus integrates the strengths of both CNN-CBAM and BiGRU. Experimental results demonstrate the effectiveness of this combined approach. On the human dataset, the proposed method outperforms the CNN-BiGRU+ATT model by +1.0 % for cellular components, +1.1 % for molecular functions, and +0.5 % for biological processes. On the yeast dataset, it outperforms the CNN-BiGRU+ATT model by +2.4 % for cellular components, +1.2 % for molecular functions, and +0.6 % for biological processes.
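The CBAM stage described in the abstract refines CNN feature maps by first applying channel attention (a spatial-attention stage follows in the full module). As a rough, dependency-free illustration of the channel-attention idea only, here is a Python sketch: the function name `channel_attention`, the `reduction` ratio, the toy feature map, and the randomly initialised stand-in weights are all illustrative assumptions, not the authors' implementation.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(fmap, reduction=2):
    """CBAM-style channel attention on a C x H x W feature map (nested lists).

    Each channel is rescaled by sigmoid(MLP(avg-pool) + MLP(max-pool)),
    where a two-layer MLP is shared between the two pooled descriptors.
    Weights here are random stand-ins for learned parameters.
    """
    C = len(fmap)
    hidden = max(1, C // reduction)
    W1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(hidden)]
    W2 = [[random.uniform(-1, 1) for _ in range(hidden)] for _ in range(C)]

    def mlp(vec):
        h = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in W1]  # ReLU
        return [sum(w * v for w, v in zip(row, h)) for row in W2]

    # Global average- and max-pooling over each channel's spatial grid.
    avg = [sum(sum(r) for r in ch) / (len(ch) * len(ch[0])) for ch in fmap]
    mx = [max(max(r) for r in ch) for ch in fmap]
    gates = [sigmoid(a + b) for a, b in zip(mlp(avg), mlp(mx))]

    # Rescale every channel by its attention gate (a value in (0, 1)).
    return [[[v * g for v in row] for row in ch] for ch, g in zip(fmap, gates)]

# A toy 2-channel, 2x2 feature map.
feat = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.5, 0.5], [0.5, 0.5]]]
refined = channel_attention(feat)
```

In the full CBAM, a spatial-attention map (a small convolution over channel-wise average- and max-pooled maps) is applied to the channel-refined output before it is passed on, here, to the BiGRU stage.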
Keywords: convolutional neural network; convolutional block attention module; gated recurrent unit; protein language models
Date: 2024
Downloads: https://doi.org/10.1515/sagmb-2024-0004 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.
Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:23:y:2024:i:1:p:23:n:1001
Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html
DOI: 10.1515/sagmb-2024-0004
Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf