Illuminating protein space with a programmable generative model

John B. Ingraham, Max Baranov, Zak Costello, Karl W. Barber, Wujie Wang, Ahmed Ismail, Vincent Frappier, Dana M. Lord, Christopher Ng-Thow-Hing, Erik R. Van Vlack, Shan Tie, Vincent Xue, Sarah C. Cowles, Alan Leung, João V. Rodrigues, Claudio L. Morales-Perez, Alex M. Ayoub, Robin Green, Katherine Puentes, Frank Oplinger, Nishant V. Panwar, Fritz Obermeyer, Adam R. Root, Andrew L. Beam, Frank J. Poelwijk and Gevorg Grigoryan
Additional contact information
John B. Ingraham: Generate Biomedicines
Max Baranov: Generate Biomedicines
Zak Costello: Generate Biomedicines
Karl W. Barber: Generate Biomedicines
Wujie Wang: Generate Biomedicines
Ahmed Ismail: Generate Biomedicines
Vincent Frappier: Generate Biomedicines
Dana M. Lord: Generate Biomedicines
Christopher Ng-Thow-Hing: Generate Biomedicines
Erik R. Van Vlack: Generate Biomedicines
Shan Tie: Generate Biomedicines
Vincent Xue: Generate Biomedicines
Sarah C. Cowles: Generate Biomedicines
Alan Leung: Generate Biomedicines
João V. Rodrigues: Generate Biomedicines
Claudio L. Morales-Perez: Generate Biomedicines
Alex M. Ayoub: Generate Biomedicines
Robin Green: Generate Biomedicines
Katherine Puentes: Generate Biomedicines
Frank Oplinger: Generate Biomedicines
Nishant V. Panwar: Generate Biomedicines
Fritz Obermeyer: Generate Biomedicines
Adam R. Root: Generate Biomedicines
Andrew L. Beam: Generate Biomedicines
Frank J. Poelwijk: Generate Biomedicines
Gevorg Grigoryan: Generate Biomedicines

Nature, 2023, vol. 623, issue 7989, 1070-1078

Abstract: Three billion years of evolution has produced a tremendous diversity of protein molecules (ref. 1), but the full potential of proteins is likely to be much greater. Accessing this potential has been challenging for both computation and experiments because the space of possible protein molecules is much larger than the space of those likely to have functions. Here we introduce Chroma, a generative model for proteins and protein complexes that can directly sample novel protein structures and sequences, and that can be conditioned to steer the generative process towards desired properties and functions. To enable this, we introduce a diffusion process that respects the conformational statistics of polymer ensembles, an efficient neural architecture for molecular systems that enables long-range reasoning with sub-quadratic scaling, layers for efficiently synthesizing three-dimensional structures of proteins from predicted inter-residue geometries and a general low-temperature sampling algorithm for diffusion models. Chroma achieves protein design as Bayesian inference under external constraints, which can involve symmetries, substructure, shape, semantics and even natural-language prompts. The experimental characterization of 310 proteins shows that sampling from Chroma results in proteins that are highly expressed, fold and have favourable biophysical properties. The crystal structures of two designed proteins exhibit atomistic agreement with Chroma samples (a backbone root-mean-square deviation of around 1.0 Å). With this unified approach to protein design, we hope to accelerate the programming of protein matter to benefit human health, materials science and synthetic biology.
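
Illustrative note: the abstract reports agreement between crystal structures and Chroma samples as a backbone root-mean-square deviation (RMSD). The sketch below shows one common way such a number can be computed, namely RMSD after optimal rigid-body superposition using the Kabsch algorithm. It is not taken from the paper: the function name `kabsch_rmsd`, the use of NumPy, and the choice of superposition method are all assumptions made for illustration, and the paper's own evaluation procedure may differ.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition.

    Hypothetical helper for illustration; not the paper's evaluation code.
    Rows of P and Q must correspond to the same atoms in the same order.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("expected two arrays of identical shape (N, 3)")

    # Remove translation by centring both sets on their centroids.
    P0 = P - P.mean(axis=0)
    Q0 = Q - Q.mean(axis=0)

    # The SVD of the 3x3 covariance matrix yields the optimal rotation.
    H = P0.T @ Q0
    U, _, Vt = np.linalg.svd(H)

    # Flip one axis if needed so the result is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the remaining per-atom deviation.
    P_rot = P0 @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Q0) ** 2, axis=1))))
```

With coordinates given in ångströms, the return value is in ångströms as well. Which atoms count as "backbone" (for example N, Cα, C and O per residue) is a choice left to the caller here, and is likewise an assumption rather than something the abstract specifies.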

Date: 2023

Downloads: (external link)
https://www.nature.com/articles/s41586-023-06728-8 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:nat:nature:v:623:y:2023:i:7989:d:10.1038_s41586-023-06728-8

Ordering information: This journal article can be ordered from
https://www.nature.com/

DOI: 10.1038/s41586-023-06728-8

Nature is currently edited by Magdalena Skipper

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:nature:v:623:y:2023:i:7989:d:10.1038_s41586-023-06728-8