BTFBS: Binding Prediction of Bacterial Transcription Factors and Binding Sites Based on Deep Learning
Bingbing Jin,
Song Liang,
Xiaoqian Liu,
Rui Zhang,
Yun Zhu,
Yuanyuan Chen,
Guangjin Liu and
Tao Yang
Additional contact information
Bingbing Jin: College of Sciences, Nanjing Agricultural University, Nanjing 210095, China
Song Liang: College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China
Xiaoqian Liu: College of Sciences, Nanjing Agricultural University, Nanjing 210095, China
Rui Zhang: College of Sciences, Nanjing Agricultural University, Nanjing 210095, China
Yun Zhu: College of Sciences, Nanjing Agricultural University, Nanjing 210095, China
Yuanyuan Chen: College of Sciences, Nanjing Agricultural University, Nanjing 210095, China
Guangjin Liu: College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China
Tao Yang: College of Sciences, Nanjing Agricultural University, Nanjing 210095, China
Mathematics, 2025, vol. 13, issue 4, 1-15
Abstract:
The binding of transcription factors (TFs) to TF binding sites plays a vital role in regulating gene expression and in evolution. Machine learning and deep learning methods have achieved some success in predicting transcription factors and binding sites. In this paper, we develop a model, BTFBS, which predicts whether a bacterial transcription factor binds a given binding site. The model takes both the amino acid sequences of bacterial transcription factors and the nucleotide sequences of binding sites as inputs, and extracts features with a convolutional neural network and multi-head attention (MultiheadAttention). To construct negative samples for the model inputs, we use two sampling methods: RS and EE. On the RS test dataset, the accuracy, sensitivity, specificity, F1-score, and MCC of BTFBS are 0.91446, 0.89746, 0.93134, 0.91264, and 0.82946, respectively. On the EE test dataset, the accuracy, sensitivity, specificity, F1-score, and MCC of BTFBS are 0.87868, 0.89354, 0.86394, 0.87996, and 0.75796, respectively. Our findings also indicate that, in bacterial research, the optimal approach for obtaining negative samples is to use the whole genome sequences of the corresponding bacteria rather than the shuffling method. These test-set results show that the proposed BTFBS model performs well and can provide guidance for experiments.
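The five metrics reported in the abstract can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' evaluation code; the function name `binary_metrics` and the counts in the usage note are hypothetical and chosen only for illustration.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn: binding pairs predicted as binding / non-binding.
    tn/fp: non-binding pairs predicted as non-binding / binding.
    (Hypothetical helper for illustration; not taken from the BTFBS code.)
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    # F1 written directly in terms of counts: 2TP / (2TP + FP + FN)
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    # Matthews correlation coefficient (MCC)
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }
```

For example, `binary_metrics(tp=90, fp=7, tn=93, fn=10)` uses made-up counts for a balanced test set of 200 pairs. On such a balanced set, accuracy is the mean of sensitivity and specificity, which is consistent with the RS and EE figures quoted above.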
Keywords: transcription factors; binding sites; deep learning; bacteria
JEL-codes: C
Date: 2025
Downloads:
https://www.mdpi.com/2227-7390/13/4/589/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/4/589/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:4:p:589-:d:1588381
Mathematics is currently edited by Ms. Emma He