Structure-based prediction of nucleic acid binding residues by merging deep learning- and template-based approaches

Jiang, Zheng; Shen, Yue-Yue; Liu, Rong

PLOS Computational Biology, 2023, vol. 19, issue 9, 1-24

Abstract: Accurate prediction of nucleic acid binding residues is essential for understanding transcription and translation processes. Integration of feature- and template-based strategies could improve the prediction of these key residues in proteins. Nevertheless, traditional hybrid algorithms have been surpassed by recently developed deep learning-based methods, and the possibility of integrating deep learning- and template-based approaches to improve performance remains to be explored. To address these issues, we developed a novel structure-based integrative algorithm called NABind that can accurately predict DNA- and RNA-binding residues. A deep learning module was built on diversified sequence and structural descriptors and edge-aggregated graph attention networks, while a template module was constructed by transforming the alignments between the query and its multiple templates into features for supervised learning. Furthermore, a stacking strategy was adopted to integrate these two modules and improve prediction performance. Finally, a post-processing module based on the random walk algorithm was proposed to further correct the integrative predictions. Extensive evaluations indicated that our approach not only achieved excellent performance on both native and predicted structures but also outperformed existing hybrid algorithms and recent deep learning methods. The NABind server is available at http://liulab.hzau.edu.cn/NABind/.

Author summary: Ten years ago we developed two hybrid algorithms (DNABind and RBRDetector) to predict nucleic acid binding residues by combining machine learning- and template-based strategies. However, algorithms of this kind have been surpassed by recent deep learning methods. Moreover, the interplay between deep learning- and template-based approaches has yet to be explored. We thus designed a new-generation hybrid algorithm termed NABind, in which a deep learning module was established using diversified sequence and structural descriptors and edge-featured graph attention networks, while a template module was created by exploiting the relationship between the query protein and its multiple templates for supervised learning. Afterward, a merging module based on the stacking strategy was adopted to integrate these two modules, and a post-processing module based on the random walk algorithm was used to correct the integrative predictions. The new algorithm outperformed traditional hybrid methods by a large margin and showed better results than purely deep learning-based methods.
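
The random-walk post-processing step mentioned above can be illustrated with a small sketch. The abstract does not say which random walk variant NABind uses, how the residue graph is built, or what parameters are chosen, so everything below is an assumption for illustration only: a random walk with restart over a hypothetical residue contact matrix, with a hypothetical function name and a hypothetical restart probability of 0.5. The idea is that a residue's score is pulled toward the scores of its spatial neighbours, so isolated high or low predictions are softened.

```python
def random_walk_with_restart(adjacency, initial_scores, restart=0.5,
                             tol=1e-9, max_iter=1000):
    """Smooth per-residue scores over a residue contact graph (illustrative).

    adjacency: symmetric 0/1 matrix as a list of lists; adjacency[i][j] == 1
        if residues i and j are assumed to be in contact.
    initial_scores: non-negative scores from an upstream predictor.
    restart: probability of jumping back to the initial distribution.
    Returns a list of smoothed scores that sums to 1.
    """
    n = len(initial_scores)
    total = sum(initial_scores)
    if total == 0:
        return [0.0] * n
    # Normalise the initial scores into a probability distribution.
    p0 = [s / total for s in initial_scores]
    # Degree of each residue (column sums of the adjacency matrix).
    degree = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]

    p = p0[:]
    for _ in range(max_iter):
        nxt = []
        for i in range(n):
            # Probability mass flowing into residue i from its neighbours.
            spread = sum(adjacency[i][j] * p[j] / degree[j]
                         for j in range(n) if degree[j] > 0)
            # A residue with no contacts keeps its own mass (self-loop).
            if degree[i] == 0:
                spread += p[i]
            nxt.append((1 - restart) * spread + restart * p0[i])
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:
            break
    return p


# Toy example: three residues in a chain (0-1-2); only residue 0 is
# initially predicted as binding.
contacts = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
smoothed = random_walk_with_restart(contacts, [1.0, 0.0, 0.0])
```

In this toy chain, some of the score assigned to residue 0 leaks to its neighbour and, to a lesser extent, to the residue two steps away. A real pipeline would derive the contact matrix from the protein structure and would still need a threshold to turn the smoothed scores into binding labels.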

Date: 2023

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011428 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 11428&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1011428

DOI: 10.1371/journal.pcbi.1011428

More articles in PLOS Computational Biology from Public Library of Science