
Dirichlet latent modelling enables effective learning and sampling of the functional protein design space

Evgenii Lobzaev and Giovanni Stracquadanio
Additional contact information
Evgenii Lobzaev: The University of Edinburgh
Giovanni Stracquadanio: The University of Edinburgh

Nature Communications, 2024, vol. 15, issue 1, 1-11

Abstract: Engineering proteins with desired functions and biochemical properties is pivotal for biotechnology and drug discovery. While computational methods based on evolutionary information are reducing the experimental burden by designing targeted libraries of functional variants, they still have a low success rate when the desired protein has few or very remote homologous sequences. Here we propose an autoregressive model, called Temporal Dirichlet Variational Autoencoder (TDVAE), which exploits the mathematical properties of the Dirichlet distribution and temporal convolution to efficiently learn high-order information from a functionally related, possibly remotely similar, set of sequences. TDVAE is highly accurate in predicting the effects of amino acid mutations, while being 90% smaller than other state-of-the-art models. We then use TDVAE to design variants of the human alpha galactosidase enzyme as a potential treatment for Fabry disease. Our model builds a library of diverse variants that retain the sequence, biochemical and structural properties of the wildtype protein, suggesting they could be suitable for enzyme replacement therapy. Taken together, our results show the importance of accurate sequence modelling and the potential of autoregressive models as protein engineering and analysis tools.

Date: 2024

Downloads: (external link)
https://www.nature.com/articles/s41467-024-53622-6 Abstract (text/html)


Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-53622-6

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-024-53622-6

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-53622-6