A multi-task convolutional deep neural network for variant calling in single molecule sequencing
Ruibang Luo (),
Fritz J. Sedlazeck,
Tak-Wah Lam and
Michael C. Schatz
Additional contact information
Ruibang Luo: The University of Hong Kong
Fritz J. Sedlazeck: Baylor College of Medicine
Tak-Wah Lam: The University of Hong Kong
Michael C. Schatz: Johns Hopkins University
Nature Communications, 2019, vol. 10, issue 1, 1-11
Abstract:
Abstract The accurate identification of DNA sequence variants is an important, but challenging task in genomics. It is particularly difficult for single molecule sequencing, which has a per-nucleotide error rate of ~5–15%. Meeting this demand, we developed Clairvoyante, a multi-task five-layer convolutional neural network model for predicting variant type (SNP or indel), zygosity, alternative allele and indel length from aligned reads. For the well-characterized NA12878 human sample, Clairvoyante achieves 99.67, 95.78, 90.53% F1-score on 1KP common variants, and 98.65, 92.57, 87.26% F1-score for whole-genome analysis, using Illumina, PacBio, and Oxford Nanopore data, respectively. Training on a second human sample shows Clairvoyante is sample agnostic and finds variants in less than 2 h on a standard server. Furthermore, we present 3,135 variants that are missed using Illumina but supported independently by both PacBio and Oxford Nanopore reads. Clairvoyante is available open-source ( https://github.com/aquaskyline/Clairvoyante ), with modules to train, utilize and visualize the model.
Date: 2019
References: Add references at CitEc
Citations: View citations in EconPapers (2)
Downloads: (external link)
https://www.nature.com/articles/s41467-019-09025-z Abstract (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:10:y:2019:i:1:d:10.1038_s41467-019-09025-z
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-019-09025-z
Access Statistics for this article
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
More articles in Nature Communications from Nature
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().