EconPapers    
Economics at your fingertips  
 

Identification of errors in draft genome assemblies at single-nucleotide resolution for quality assessment and improvement

Kunpeng Li, Peng Xu, Jinpeng Wang, Xin Yi and Yuannian Jiao ()
Additional contact information
Kunpeng Li: the Chinese Academy of Sciences
Peng Xu: the Chinese Academy of Sciences
Jinpeng Wang: the Chinese Academy of Sciences
Xin Yi: the Chinese Academy of Sciences
Yuannian Jiao: the Chinese Academy of Sciences

Nature Communications, 2023, vol. 14, issue 1, 1-12

Abstract: Abstract Assembly of a high-quality genome is important for downstream comparative and functional genomic studies. However, most tools for genome assembly assessment only give qualitative reports, which do not pinpoint assembly errors at specific regions. Here, we develop a new reference-free tool, Clipping information for Revealing Assembly Quality (CRAQ), which maps raw reads back to assembled sequences to identify regional and structural assembly errors based on effective clipped alignment information. Error counts are transformed into corresponding assembly evaluation indexes to reflect the assembly quality at single-nucleotide resolution. Notably, CRAQ distinguishes assembly errors from heterozygous sites or structural differences between haplotypes. This tool can clearly indicate low-quality regions and potential structural error breakpoints; thus, it can identify misjoined regions that should be split for further scaffold building and improvement of the assembly. We have benchmarked CRAQ on multiple genomes assembled using different strategies, and demonstrated the misjoin correction for improving the constructed pseudomolecules.

Date: 2023
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
https://www.nature.com/articles/s41467-023-42336-w Abstract (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-42336-w

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-023-42336-w

Access Statistics for this article

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

More articles in Nature Communications from Nature
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().

 
Page updated 2025-03-19
Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-42336-w