Machine learning to design integral membrane channelrhodopsins for efficient eukaryotic expression and plasma membrane localization
Claire N Bedbrook,
Kevin K Yang,
Austin J Rice,
Viviana Gradinaru and
Frances H Arnold
PLOS Computational Biology, 2017, vol. 13, issue 10, 1-21
Abstract:
There is growing interest in studying and engineering integral membrane proteins (MPs) that play key roles in sensing and regulating cellular response to diverse external signals. A MP must be expressed, correctly inserted and folded in a lipid bilayer, and trafficked to the proper cellular location in order to function. The sequence and structural determinants of these processes are complex and highly constrained. Here we describe a predictive, machine-learning approach that captures this complexity to facilitate successful MP engineering and design. Machine learning on carefully-chosen training sequences made by structure-guided SCHEMA recombination has enabled us to accurately predict the rare sequences in a diverse library of channelrhodopsins (ChRs) that express and localize to the plasma membrane of mammalian cells. These light-gated channel proteins of microbial origin are of interest for neuroscience applications, where expression and localization to the plasma membrane is a prerequisite for function. We trained Gaussian process (GP) classification and regression models with expression and localization data from 218 ChR chimeras chosen from a 118,098-variant library designed by SCHEMA recombination of three parent ChRs. We use these GP models to identify ChRs that express and localize well and show that our models can elucidate sequence and structure elements important for these processes. We also used the predictive models to convert a naturally occurring ChR incapable of mammalian localization into one that localizes well.
Author summary:
A protein’s amino acid sequence determines how it will fold, traffic to subcellular locations, and carry out specific functions within the cell. Understanding this process would enable the design of protein sequences capable of useful functions; unfortunately, we cannot predict in detail how sequence encodes function. However, machine-learning models have the potential to infer the complex protein sequence-function relationship by identifying patterns or features that are important for function from sequences with known functions. We used machine learning to learn about and design membrane proteins (MPs). To function, a MP must be expressed, correctly folded in a lipid membrane, and trafficked to the proper cellular location. We built predictive, machine-learning models for this complex process from a set of >200 chimeric MPs and used them to design new sequences with optimal performance on the challenging task of membrane localization. This general approach to understanding and designing MPs could be broadly useful for important pharmaceutical and engineering MP targets.
Date: 2017
Downloads:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005786 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 05786&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1005786
DOI: 10.1371/journal.pcbi.1005786
More articles in PLOS Computational Biology from Public Library of Science