Universal Count Correction for High-Throughput Sequencing
Tatsunori B Hashimoto,
Matthew D Edwards and
David K Gifford
PLOS Computational Biology, 2014, vol. 10, issue 3, 1-11
Abstract:
We show that existing RNA-seq, DNase-seq, and ChIP-seq data exhibit overdispersed per-base read count distributions that are not matched to existing computational method assumptions. To compensate for this overdispersion we introduce a nonparametric and universal method for processing per-base sequencing read count data called Fixseq. We demonstrate that Fixseq substantially improves the performance of existing RNA-seq, DNase-seq, and ChIP-seq analysis tools when compared with existing alternatives.
Author Summary:
High-throughput DNA sequencing has been adapted to measure diverse biological state information including RNA expression, chromatin accessibility, and transcription factor binding to the genome. The accurate inference of biological mechanism from sequence counts requires a model of how sequence counts are distributed. We show that presently used sequence count distribution models are typically inaccurate and present a new method called Fixseq to process counts to more closely follow existing count models. On typical datasets Fixseq improves the performance of existing tools for RNA-seq, DNase-seq, and ChIP-seq, while yielding complementary additional gains in cases where domain-specific tools are available.
Date: 2014
Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003494 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 03494&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1003494
DOI: 10.1371/journal.pcbi.1003494
More articles in PLOS Computational Biology from Public Library of Science