PICNIC accurately predicts condensate-forming proteins regardless of their structural disorder across organisms

Anna Hadarovich, Hari Raj Singh, Soumyadeep Ghosh, Maxim Scheremetjew, Nadia Rostam, Anthony A. Hyman and Agnes Toth-Petroczy
Additional contact information
Anna Hadarovich: Max Planck Institute of Molecular Cell Biology and Genetics
Hari Raj Singh: Max Planck Institute of Molecular Cell Biology and Genetics
Soumyadeep Ghosh: Max Planck Institute of Molecular Cell Biology and Genetics
Maxim Scheremetjew: Max Planck Institute of Molecular Cell Biology and Genetics
Nadia Rostam: Max Planck Institute of Molecular Cell Biology and Genetics
Anthony A. Hyman: Max Planck Institute of Molecular Cell Biology and Genetics
Agnes Toth-Petroczy: Max Planck Institute of Molecular Cell Biology and Genetics

Nature Communications, 2024, vol. 15, issue 1, 1-16

Abstract: Biomolecular condensates are membraneless organelles that can concentrate hundreds of different proteins in cells to operate essential biological functions. However, accurate identification of their components remains challenging and biased towards proteins with high structural disorder content with focus on self-phase separating (driver) proteins. Here, we present a machine learning algorithm, PICNIC (Proteins Involved in CoNdensates In Cells) to classify proteins that localize to biomolecular condensates regardless of their role in condensate formation. PICNIC successfully predicts condensate members by learning amino acid patterns in the protein sequence and structure in addition to the intrinsic disorder. Extensive experimental validation of 24 positive predictions in cellulo shows an overall ~82% accuracy regardless of the structural disorder content of the tested proteins. While increasing disorder content is associated with organismal complexity, our analysis of 26 species reveals no correlation between predicted condensate proteome content and disorder content across organisms. Overall, we present a machine learning classifier to interrogate condensate components at whole-proteome levels across the tree of life.

Date: 2024

Downloads: (external link)
https://www.nature.com/articles/s41467-024-55089-x Abstract (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-55089-x

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-024-55089-x

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-55089-x