A generalizable Cas9/sgRNA prediction model using machine transfer learning with small high-quality datasets
Dalton T. Ham,
Tyler S. Browne,
Pooja N. Banglorewala,
Tyler L. Wilson,
Richard K. Michael,
Gregory B. Gloor (ggloor@uwo.ca) and
David R. Edgell (dedgell@uwo.ca)
Additional contact information
Dalton T. Ham: Schulich School of Medicine and Dentistry
Tyler S. Browne: Schulich School of Medicine and Dentistry
Pooja N. Banglorewala: Schulich School of Medicine and Dentistry
Tyler L. Wilson: Tesseraqt Optimization Inc
Richard K. Michael: Tesseraqt Optimization Inc
Gregory B. Gloor: Schulich School of Medicine and Dentistry
David R. Edgell: Schulich School of Medicine and Dentistry
Nature Communications, 2023, vol. 14, issue 1, 1-16
Abstract:
The CRISPR/Cas9 nuclease from Streptococcus pyogenes (SpCas9) can be used with single guide RNAs (sgRNAs) as a sequence-specific antimicrobial agent and as a genome-engineering tool. However, current bacterial sgRNA activity models struggle with accurate predictions and do not generalize well, possibly because the underlying datasets used to train the models do not accurately measure SpCas9/sgRNA activity and cannot distinguish on-target cleavage from toxicity. Here, we solve this problem by using a two-plasmid positive selection system to generate high-quality data that more accurately reports on SpCas9/sgRNA cleavage and that separates activity from toxicity. We develop a machine learning architecture (crisprHAL) that can be trained on existing datasets, that shows marked improvements in sgRNA activity prediction accuracy when transfer learning is used with small amounts of high-quality data, and that can generalize predictions to different bacteria. The crisprHAL model recapitulates known SpCas9/sgRNA-target DNA interactions and provides a pathway to a generalizable sgRNA bacterial activity prediction tool that will enable accurate antimicrobial and genome engineering applications.
Date: 2023
Downloads: https://www.nature.com/articles/s41467-023-41143-7 (abstract, text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-41143-7
DOI: 10.1038/s41467-023-41143-7
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie