Rapid and accurate prediction of protein homo-oligomer symmetry using Seq2Symm
Meghana Kshirsagar,
Artur Meller,
Ian R. Humphreys,
Samuel Sledzieski,
Yixi Xu,
Rahul Dodhia,
Eric Horvitz,
Bonnie Berger,
Gregory R. Bowman,
Juan Lavista Ferres,
David Baker and
Minkyung Baek
Additional contact information
Meghana Kshirsagar: Microsoft Corporation
Artur Meller: Washington University in St. Louis
Ian R. Humphreys: University of Washington
Samuel Sledzieski: Microsoft Corporation
Yixi Xu: Microsoft Corporation
Rahul Dodhia: Microsoft Corporation
Eric Horvitz: Microsoft Corporation
Bonnie Berger: Massachusetts Institute of Technology
Gregory R. Bowman: University of Pennsylvania
Juan Lavista Ferres: Microsoft Corporation
David Baker: University of Washington
Minkyung Baek: Seoul National University
Nature Communications, 2025, vol. 16, issue 1, 1-11
Abstract:
The majority of proteins must form higher-order assemblies to perform their biological functions, yet few machine learning models can accurately and rapidly predict the symmetry of assemblies involving multiple copies of the same protein chain. Here, we address this gap by fine-tuning several classes of protein foundation models to predict homo-oligomer symmetry. Our best model, Seq2Symm, which utilizes ESM2, outperforms existing template-based and deep learning methods, achieving an average AUC-PR of 0.47, 0.44 and 0.49 across homo-oligomer symmetries on three held-out test sets, compared to 0.24, 0.24 and 0.25 with template-based search. Seq2Symm uses a single sequence as input and can predict at a rate of ~80,000 proteins/hour. We apply this method to 5 proteomes and ~3.5 million unlabeled protein sequences, showing its promise to be used in conjunction with downstream computationally intensive all-atom structure generation methods such as RoseTTAFold2 and AlphaFold2-multimer. Code, datasets, and model are available at: https://github.com/microsoft/seq2symm.
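As an illustration of the headline metric, the sketch below computes a per-class AUC-PR (using the step-wise average-precision approximation) and averages it across symmetry classes. This is a minimal sketch under stated assumptions, not the authors' evaluation code: the abstract does not specify how AUC-PR is computed or averaged, and the function names, class labels, and toy scores here are all hypothetical.

```python
from typing import Dict, List, Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve for one class.

    labels: 1 if the protein has this symmetry, else 0.
    scores: model confidence for this symmetry (higher = more likely).
    Tied scores are not treated specially in this sketch.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank examples from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at each new recall level
    return total / n_pos


def macro_auc_pr(y_true: Dict[str, List[int]],
                 y_score: Dict[str, List[float]]) -> float:
    """Unweighted mean of the per-class AUC-PR values (assumed averaging)."""
    per_class = [average_precision(y_true[c], y_score[c]) for c in y_true]
    return sum(per_class) / len(per_class)


# Toy data for two illustrative symmetry classes (hypothetical values).
y_true = {"C2": [1, 0, 1, 0], "C3": [0, 1, 0, 0]}
y_score = {"C2": [0.9, 0.8, 0.7, 0.1], "C3": [0.2, 0.3, 0.6, 0.1]}
print(macro_auc_pr(y_true, y_score))
```

An unweighted mean treats rare and common symmetry classes equally, which is one plausible reading of "average AUC-PR across homo-oligomer symmetries"; a prevalence-weighted mean would give different numbers.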
Date: 2025
Downloads: (external link)
https://www.nature.com/articles/s41467-025-57148-3 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-57148-3
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-025-57148-3
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.