A multi-task convolutional deep neural network for variant calling in single molecule sequencing

Ruibang Luo, Fritz J. Sedlazeck, Tak-Wah Lam and Michael C. Schatz
Additional contact information
Ruibang Luo: The University of Hong Kong
Fritz J. Sedlazeck: Baylor College of Medicine
Tak-Wah Lam: The University of Hong Kong
Michael C. Schatz: Johns Hopkins University

Nature Communications, 2019, vol. 10, issue 1, 1-11

Abstract: The accurate identification of DNA sequence variants is an important, but challenging task in genomics. It is particularly difficult for single molecule sequencing, which has a per-nucleotide error rate of ~5–15%. Meeting this demand, we developed Clairvoyante, a multi-task five-layer convolutional neural network model for predicting variant type (SNP or indel), zygosity, alternative allele and indel length from aligned reads. For the well-characterized NA12878 human sample, Clairvoyante achieves 99.67, 95.78, 90.53% F1-score on 1KP common variants, and 98.65, 92.57, 87.26% F1-score for whole-genome analysis, using Illumina, PacBio, and Oxford Nanopore data, respectively. Training on a second human sample shows Clairvoyante is sample agnostic and finds variants in less than 2 h on a standard server. Furthermore, we present 3,135 variants that are missed using Illumina but supported independently by both PacBio and Oxford Nanopore reads. Clairvoyante is available open-source (https://github.com/aquaskyline/Clairvoyante), with modules to train, utilize and visualize the model.

Date: 2019

Downloads: (external link)
https://www.nature.com/articles/s41467-019-09025-z Abstract (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:10:y:2019:i:1:d:10.1038_s41467-019-09025-z

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-019-09025-z

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:natcom:v:10:y:2019:i:1:d:10.1038_s41467-019-09025-z