PLM-interact: extending protein language models to predict protein-protein interactions

Dan Liu, Francesca Young, Kieran D. Lamb, Adalberto Claudio Quiros, Alexandrina Pancheva, Crispin J. Miller, Craig Macdonald, David L. Robertson and Ke Yuan
Additional contact information
Dan Liu: MRC-University of Glasgow Centre for Virus Research
Francesca Young: MRC-University of Glasgow Centre for Virus Research
Kieran D. Lamb: MRC-University of Glasgow Centre for Virus Research
Adalberto Claudio Quiros: University of Glasgow
Alexandrina Pancheva: Cancer Research UK Scotland Institute
Crispin J. Miller: University of Glasgow
Craig Macdonald: University of Glasgow
David L. Robertson: MRC-University of Glasgow Centre for Virus Research
Ke Yuan: University of Glasgow

Nature Communications, 2025, vol. 16, issue 1, 1-14

Abstract: Computational prediction of protein structure from amino acid sequence alone has been achieved with unprecedented accuracy, yet the prediction of protein-protein interactions remains a challenge. Here, we assess the ability of protein language models (PLMs), routinely applied to protein folding, to be retrained for protein-protein interaction prediction. Existing models that exploit PLMs use a pre-trained PLM feature set, ignoring that the proteins are physically interacting. We propose PLM-interact, which goes beyond single proteins by jointly encoding protein pairs to learn their relationships, analogous to the next-sentence prediction task from natural language processing. This approach achieves state-of-the-art performance in a widely adopted cross-species protein-protein interaction prediction benchmark: trained on human data and tested on mouse, fly, worm, E. coli and yeast. In addition, we develop a fine-tuning method for PLM-interact to detect mutation effects on interactions. Finally, we report that the model outperforms existing approaches in predicting virus-host interactions at the protein level. Our work demonstrates that large language models can be extended to learn the intricate relationships among biomolecules from their sequences alone.
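
To make the joint-encoding idea concrete: the Python sketch below feeds two amino acid sequences through a pre-trained protein language model as a single input, so self-attention spans the pair, in the spirit of BERT-style next-sentence prediction described in the abstract. This is a minimal illustration, not the published PLM-interact implementation; the ESM-2 checkpoint name, the mean pooling, the linear interaction head, and the toy sequences are all assumptions made for the example.

import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

class PairInteractionClassifier(nn.Module):
    """Binary classifier over a jointly encoded protein pair (illustrative)."""

    def __init__(self, plm_name: str = "facebook/esm2_t6_8M_UR50D"):
        super().__init__()
        # Pre-trained protein language model; this small checkpoint is an
        # assumption chosen so the sketch runs on modest hardware.
        self.tokenizer = AutoTokenizer.from_pretrained(plm_name)
        self.encoder = AutoModel.from_pretrained(plm_name)
        # Interaction head over the pooled pair representation (assumed design).
        self.head = nn.Linear(self.encoder.config.hidden_size, 2)

    def forward(self, seq_a: str, seq_b: str) -> torch.Tensor:
        # Tokenize the two sequences as one input so attention spans the pair,
        # mirroring next-sentence prediction over (sentence A, sentence B).
        batch = self.tokenizer(seq_a, seq_b, return_tensors="pt",
                               truncation=True, max_length=1024)
        hidden = self.encoder(**batch).last_hidden_state
        # Mean-pool over non-padding tokens (an assumed pooling choice;
        # CLS-token pooling would be a comparable alternative).
        mask = batch["attention_mask"].unsqueeze(-1)
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
        return self.head(pooled)  # logits: [no interaction, interaction]

model = PairInteractionClassifier()
with torch.no_grad():
    logits = model("MKTAYIAKQR", "MVLSPADKTN")  # hypothetical toy sequences
print(logits.softmax(dim=-1))  # untrained head, so these scores are meaningless

In practice such a head would be fine-tuned end-to-end with the encoder on labelled interacting and non-interacting pairs, which is the kind of retraining the paper contrasts with approaches that use frozen single-protein PLM features.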

Date: 2025

Downloads:
https://www.nature.com/articles/s41467-025-64512-w Abstract (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-64512-w

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-025-64512-w

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-64512-w