A multimodal Transformer Network for protein-small molecule interactions enhances predictions of kinase inhibition and enzyme-substrate relationships
Alexander Kroll, Sahasra Ranjan and Martin J Lercher
PLOS Computational Biology, 2024, vol. 20, issue 5, 1-23
Abstract:
The activities of most enzymes and drugs depend on interactions between proteins and small molecules. Accurate prediction of these interactions could greatly accelerate pharmaceutical and biotechnological research. Current machine learning models designed for this task have a limited ability to generalize beyond the proteins used for training. This limitation is likely due to a lack of information exchange between the protein and the small molecule during the generation of the required numerical representations. Here, we introduce ProSmith, a machine learning framework that employs a multimodal Transformer Network to simultaneously process protein amino acid sequences and small molecule strings in the same input. This approach facilitates the exchange of all relevant information between the two molecule types during the computation of their numerical representations, allowing the model to account for their structural and functional interactions. Our final model combines gradient boosting predictions based on the resulting multimodal Transformer Network with independent predictions based on separate deep learning representations of the proteins and small molecules. The resulting predictions outperform recently published state-of-the-art models for predicting protein-small molecule interactions across three diverse tasks: predicting kinase inhibitions; inferring potential substrates for enzymes; and predicting Michaelis constants KM. The Python code provided can be used to easily implement and improve machine learning predictions involving arbitrary protein-small molecule interactions.
Author summary:
Understanding how proteins interact with small molecules, such as drugs, is critical to advancing medical, biological, and biotechnological research. Our work introduces ProSmith, a machine learning framework that improves the prediction of protein-small molecule interactions.
Protein-small molecule interactions can be predicted by using numerical representations of proteins and small molecules as input to machine learning prediction models. Previous methods typically generated separate numerical representations for the proteins and small molecules without considering their interactions. ProSmith, however, combines both protein sequence and small molecule structural information in the input of a single multimodal Transformer Network to generate a joint numerical representation. Unlike previous methods, this allows for a comprehensive exchange of information between protein and small molecule, capturing the complex relationships and interactions between these two types of molecules. ProSmith successfully predicts several biological interactions, including kinase inhibitions, potential enzyme-substrate pairs, and enzyme kinetic parameters KM. We provide Python code that can be easily adapted to improve predictions for any protein-small molecule interaction.
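The idea of placing both molecule types in the same input can be illustrated with a minimal sketch. It is not taken from the ProSmith code: the function name `build_joint_input`, the `[CLS]`/`[SEP]` marker tokens, the segment ids, and the character-level tokenization are all assumptions chosen for illustration (splitting a SMILES string by character would, for example, break two-letter atoms such as `Cl`).

```python
from typing import List, Tuple

# Hypothetical marker tokens, borrowed from common Transformer conventions.
CLS, SEP = "[CLS]", "[SEP]"


def build_joint_input(protein_seq: str, smiles: str) -> Tuple[List[str], List[int]]:
    """Concatenate protein and small-molecule tokens into one input sequence."""
    protein_tokens = list(protein_seq)  # one token per amino acid letter
    molecule_tokens = list(smiles)      # naive: one token per SMILES character

    tokens = [CLS] + protein_tokens + [SEP] + molecule_tokens + [SEP]

    # Segment ids mark which modality each position belongs to:
    # 0 = protein part (including [CLS] and the first [SEP]), 1 = small molecule.
    segment_ids = [0] * (len(protein_tokens) + 2) + [1] * (len(molecule_tokens) + 1)
    return tokens, segment_ids


# Toy example: a three-residue peptide and ethanol.
tokens, segment_ids = build_joint_input("MKT", "CCO")
print(tokens)
print(segment_ids)
```

Because both segments sit in one sequence, a Transformer's self-attention can relate any protein position to any small-molecule position, which is the information exchange the summary describes. Encoding the two inputs separately would rule that out until a later prediction stage.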
Date: 2024
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1012100 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 12100&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1012100
DOI: 10.1371/journal.pcbi.1012100
More articles in PLOS Computational Biology from Public Library of Science