Error analysis of the PacBio sequencing CCS reads
Pourmohammadi Reza,
Abouei Jamshid and
Anpalagan Alagan
Additional contact information
Pourmohammadi Reza: WINEL Research Laboratory at the Department of Electrical Engineering, Yazd University, Yazd, Iran
Abouei Jamshid: WINEL Research Laboratory at the Department of Electrical Engineering, Yazd University, Yazd, Iran
Anpalagan Alagan: Department of Electrical, Computer and Biomedical Engineering, Ryerson University, Toronto, Canada
The International Journal of Biostatistics, 2023, vol. 19, issue 2, 439-453
Abstract:
Third generation sequencing technologies such as Pacific Biosciences (PacBio) and Oxford Nanopore generate longer reads than next generation sequencing, which makes the assembly process faster, simpler and more cost-effective. However, the error rates of these long reads are higher than those of short reads, so an error correction step is needed before assembly, such as the use of Circular Consensus Sequencing (CCS) reads in PacBio sequencing machines. In this paper, we propose a probabilistic model for the occurrence of errors along the CCS reads. We obtain the error probability of an arbitrary nucleotide, as well as the base calling Phred quality score of the nucleotides along the CCS reads, in terms of the number of sub-reads. Furthermore, we derive the error rate distribution of the reads as a function of the pass number. This distribution is binomial and can be approximated by a normal distribution for long reads. Finally, we evaluate the proposed model by comparing it with three real PacBio datasets, namely the Lambda and E. coli genomes and an Alzheimer’s disease targeted experiment.
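As a rough illustration of the quantities named in the abstract, the Python sketch below assumes a simple majority-vote consensus over independent sub-reads that share one per-pass error probability. This is an assumption made for illustration only and is not the model derived in the paper; the function names and the example values (per-pass error 0.1, read length 1000) are hypothetical.

```python
import math


def phred(p_error: float) -> float:
    """Standard Phred quality score: Q = -10 * log10(P_error)."""
    return -10.0 * math.log10(p_error)


def majority_vote_error(p_subread: float, n_passes: int) -> float:
    """Probability that a strict majority of n independent passes miscall a base.

    Assumption (not from the paper): every pass errs independently with the
    same probability, and the consensus is wrong only when more than half of
    the passes are wrong. Ties on an even pass count are treated as correct.
    """
    k_min = n_passes // 2 + 1
    return sum(
        math.comb(n_passes, k)
        * p_subread**k
        * (1.0 - p_subread) ** (n_passes - k)
        for k in range(k_min, n_passes + 1)
    )


def error_count_normal_approx(read_length: int, p_base: float) -> tuple[float, float]:
    """Mean and standard deviation of a Binomial(read_length, p_base) error count.

    For long reads these two values parameterize the normal approximation
    to the binomial distribution.
    """
    mean = read_length * p_base
    std = math.sqrt(read_length * p_base * (1.0 - p_base))
    return mean, std


if __name__ == "__main__":
    p = 0.1  # hypothetical per-pass error probability
    for n in (1, 3, 5, 7, 9):
        p_ccs = majority_vote_error(p, n)
        mean, std = error_count_normal_approx(1000, p_ccs)
        print(f"passes={n} P_err={p_ccs:.3e} Q={phred(p_ccs):.1f} "
              f"errors per 1000 bases: mean={mean:.3f}, std={std:.3f}")
```

Under these assumptions the consensus error probability falls, and the Phred score rises, as the number of passes grows, which matches the qualitative dependence on the pass number that the abstract describes.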
Keywords: CCS reads accuracy; CCS reads quality; PacBio error model; sequencing noise
Date: 2023
Downloads:
https://doi.org/10.1515/ijb-2021-0091 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.
Persistent link: https://EconPapers.repec.org/RePEc:bpj:ijbist:v:19:y:2023:i:2:p:439-453:n:3
Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/ijb/html
DOI: 10.1515/ijb-2021-0091
The International Journal of Biostatistics is currently edited by Antoine Chambaz, Alan E. Hubbard and Mark J. van der Laan
More articles in The International Journal of Biostatistics from De Gruyter
Bibliographic data for series maintained by Peter Golla.