Prediction of off-target specificity and cell-specific fitness of CRISPR-Cas System using attention boosted deep learning and network-based gene feature
Qiao Liu, Di He and Lei Xie
PLOS Computational Biology, 2019, vol. 15, issue 10, 1-22
Abstract:
CRISPR-Cas is a powerful genome editing technology with great potential for in vivo gene therapy. Successful translational application of CRISPR-Cas to biomedicine still faces many safety concerns, including off-target side effects, cell fitness problems after CRISPR-Cas treatment, and on-target genome editing side effects in undesired tissues. Addressing these issues requires designing single-guide RNAs (sgRNAs) with high cell-specific efficacy and specificity. Existing sgRNA design tools depend mainly on the sgRNA sequence and local information about the targeted genome, and thus cannot account for differences in the cellular response of the same gene in different cell types. To incorporate cell-specific information into sgRNA design, we develop novel interpretable machine learning models that integrate features learned from an advanced transformer-based deep neural network with cell-specific gene properties derived from biological networks and gene expression profiles, for the prediction of CRISPR-Cas9 and CRISPR-Cas12a efficacy and specificity. In benchmark studies, our models significantly outperform state-of-the-art algorithms. Furthermore, we find that network-based gene properties are critical for predicting cell-specific post-treatment cellular response. Our results suggest that the design of efficient and safe CRISPR-Cas systems needs to consider cell-specific information about genes. Our findings may support the development of more accurate predictive models of CRISPR-Cas across a broad spectrum of biological conditions and provide new insight into developing efficient and safe CRISPR-based gene therapy.
Author summary:
CRISPR-Cas is a powerful genome editing technology with great potential for in vivo gene therapy. To translate CRISPR-Cas into an efficient and safe therapeutic, it is critical to select target genes and design target-specific single-guide RNAs that maximize on-target in vivo efficiency while minimizing the side effects induced by either off-target or on-target genome editing in undesired tissues. Due to experimental and clinical limitations, CRISPR-Cas target efficiency and specificity in an intended condition (e.g. human) often need to be inferred from results in different conditions (e.g. animal models). This translational process poses a major challenge for experimental design and a potential risk in clinical development. To improve the cell-specific predictive power of machine learning models and reveal the important biological features that determine the transferability of CRISPR-Cas9 across different cells, we develop an accurate and interpretable machine learning model that integrates features extracted by attention-based deep learning with knowledge-based cell-specific gene properties. Our models significantly improve the prediction of off-target specificity and cell-specific on-target efficiency. We discover that network-based gene properties are a key determinant of model predictive power. Our findings may provide new insight into developing efficient and safe CRISPR-based gene therapy.
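The idea of combining attention-derived sequence features with gene-level properties can be illustrated with a minimal sketch. It is not the authors' architecture: the identity projections in the attention step, the mean pooling, the function names, and the example gene-level values are all assumptions made for illustration, and a trained model would learn its projection weights from data.

```python
import math

NUCLEOTIDES = "ACGT"

def one_hot(seq):
    """Encode a DNA sequence as a list of 4-dimensional one-hot vectors."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq.upper()]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with identity projections (Q = K = V = x).

    Assumption: learned query/key/value matrices are omitted to keep the
    sketch short; a real transformer layer would include them.
    """
    d = len(x[0])
    out = []
    for q in x:
        # Attention weights of this position over every position in the guide.
        weights = softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x])
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

def build_features(sgrna, gene_features):
    """Mean-pool attended sequence features and append gene-level features."""
    attended = self_attention(one_hot(sgrna))
    pooled = [sum(col) / len(attended) for col in zip(*attended)]
    return pooled + list(gene_features)

# Hypothetical inputs: a 20-nt guide and two made-up gene-level values
# (for example a network centrality score and an expression level).
features = build_features("GACGTTACCGGATCAAGCTT", [0.42, 7.1])
print(len(features))
```

The resulting vector could be passed to any downstream regressor or classifier; the choice of predictor is left open here because the abstract does not specify it.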
Date: 2019
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007480 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 07480&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1007480
DOI: 10.1371/journal.pcbi.1007480
More articles in PLOS Computational Biology from Public Library of Science