Transformer-based deep learning enables improved B-cell epitope prediction in parasitic pathogens: A proof-of-concept study on Fasciola hepatica
Rui-Si Hu,
Kui Gu,
Muhammad Ehsan,
Sayed Haidar Abbas Raza and
Chun-Ren Wang
PLOS Neglected Tropical Diseases, 2025, vol. 19, issue 4, 1-21
Abstract:
Background: The identification of B-cell epitopes (BCEs) is fundamental to advancing epitope-based vaccine design, therapeutic antibody development, and diagnostics, including for neglected tropical diseases caused by parasitic pathogens. However, the structural complexity of parasite antigens and the high cost of experimental validation make this identification difficult. Advances in Artificial Intelligence (AI)-driven protein engineering, particularly through machine learning and deep learning, offer efficient ways to improve prediction accuracy and reduce experimental costs.
Methodology/Principal findings: Here, we present deepBCE-Parasite, a Transformer-based deep learning model designed to predict linear BCEs from peptide sequences. Using a self-attention mechanism, the model achieved an accuracy of approximately 81% and an AUC of 0.90 in both 10-fold cross-validation and independent testing. Comparative analyses against 12 handcrafted features and four conventional machine learning algorithms (GNB, SVM, RF, and LGBM) highlighted the superior predictive power of the model. As a case study, deepBCE-Parasite predicted eight BCEs from the leucine aminopeptidase (LAP) protein in Fasciola hepatica proteomic data. Dot-blot immunoassays confirmed the specific binding of seven synthetic peptides to positive sera, validating their IgG reactivity and demonstrating the model’s efficacy in BCE prediction.
Conclusions/Significance: deepBCE-Parasite demonstrates excellent performance in predicting BCEs across diverse parasitic pathogens, offering a valuable tool for advancing the design of epitope-based vaccines, antibodies, and diagnostic applications in parasitology.
Author Summary: Antigen-antibody interactions are critical events in the humoral immune response, facilitating the recognition and neutralization of invasive parasites. BCEs, defined as surface-exposed clusters of amino acids recognized by B-cell receptors or antibodies, play a critical role in initiating that response. This study focuses on the identification of parasite BCEs, which serve as promising targets for the development of vaccines, therapeutic antibodies, and diagnostic tools. To this end, we developed a deep learning model, termed deepBCE-Parasite, which was rigorously benchmarked against and integrated with traditional machine learning models. These models enable rapid and precise BCE identification directly from amino acid sequences, making them particularly suitable for large-scale epitope screening. As a proof of concept, we applied these AI-driven models to predict BCEs in F. hepatica, a globally distributed parasite responsible for fascioliasis, a neglected tropical disease. Using available proteomics data for this trematode species, we identified peptides exhibiting high specificity for antibody binding. This work highlights the potential of AI in advancing epitope prediction within parasitology, providing a rapid, scalable, and cost-effective strategy for discovering immune targets.
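The abstract reports two evaluation metrics, accuracy and AUC. As a rough illustration of how such metrics are typically computed for a binary epitope/non-epitope classifier, the sketch below uses plain Python. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; they are not taken from the paper and do not reproduce its data or its evaluation code.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of peptides whose thresholded score matches the label.

    The 0.5 threshold is an assumption for this sketch, not a value
    stated in the abstract.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    AUC equals the probability that a randomly chosen positive is
    scored above a randomly chosen negative; ties count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = epitope, 0 = non-epitope.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

toy_accuracy = accuracy(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together. In a 10-fold cross-validation setting, these metrics would be computed on each held-out fold and then averaged.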
Date: 2025
Downloads: (external link)
https://journals.plos.org/plosntds/article?id=10.1371/journal.pntd.0012985 (text/html)
https://journals.plos.org/plosntds/article/file?id ... 12985&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pntd00:0012985
DOI: 10.1371/journal.pntd.0012985
More articles in PLOS Neglected Tropical Diseases from Public Library of Science