Efficient assembly of nanopore reads via highly accurate and intact error correction
Ying Chen,
Fan Nie,
Shang-Qian Xie,
Ying-Feng Zheng,
Qi Dai,
Thomas Bray,
Yao-Xin Wang,
Jian-Feng Xing,
Zhi-Jian Huang,
Wang De-Peng,
Li-Juan He,
Feng Luo,
Jian-Xin Wang,
Yi-Zhi Liu and
Chuan-Le Xiao
Additional contact information
Ying Chen: Sun Yat-sen University
Fan Nie: Central South University
Shang-Qian Xie: Hainan University
Ying-Feng Zheng: Sun Yat-sen University
Qi Dai: Zhejiang Sci-Tech University
Thomas Bray: Oxford Nanopore Technologies
Yao-Xin Wang: Zhejiang Sci-Tech University
Jian-Feng Xing: Hainan University
Zhi-Jian Huang: Sun Yat-sen University
Wang De-Peng: Nextomics Biosciences Co., Ltd
Li-Juan He: Sun Yat-sen University
Feng Luo: Clemson University
Jian-Xin Wang: Central South University
Yi-Zhi Liu: Sun Yat-sen University
Chuan-Le Xiao: Sun Yat-sen University
Nature Communications, 2021, vol. 12, issue 1, 1-10
Abstract:
Long nanopore reads are advantageous in de novo genome assembly. However, nanopore reads usually have a broad error distribution and high-error-rate subsequences. Existing error correction tools cannot correct nanopore reads efficiently and effectively. Most methods trim high-error-rate subsequences during error correction, which reduces both the length of the reads and the contiguity of the final assembly. Here, we develop an error correction and de novo assembly tool designed to overcome complex errors in nanopore reads. We propose an adaptive read selection and two-step progressive method to quickly correct nanopore reads to high accuracy. We introduce a two-stage assembler to utilize the full length of nanopore reads. Our tool achieves superior performance in both error correction and de novo assembly of nanopore reads. It requires only 8122 hours to assemble a 35X coverage human genome and achieves a 2.47-fold improvement in NG50. Furthermore, our assembly of the human WERI cell line shows an NG50 of 22 Mbp. The high-quality assembly of nanopore reads can significantly reduce false positives in structural variation detection.
Date: 2021
Downloads: (external link)
https://www.nature.com/articles/s41467-020-20236-7 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:12:y:2021:i:1:d:10.1038_s41467-020-20236-7
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-020-20236-7
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.