A sequential naïve Bayes classifier for DNA barcodes
Anderson Michael P. and Dubnicka Suzanne R.
Additional contact information
Anderson Michael P.: Department of Biostatistics and Epidemiology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
Dubnicka Suzanne R.: Department of Statistics, Kansas State University, Manhattan, KS, USA
Statistical Applications in Genetics and Molecular Biology, 2014, vol. 13, issue 4, 423-434
Abstract:
DNA barcodes are short strands of 255–700 nucleotide bases taken from the cytochrome c oxidase subunit 1 (COI) region of the mitochondrial DNA. It has been proposed that these barcodes may be used to differentiate between biological species. Current methods of species classification use distance measures that depend heavily on both evolutionary model assumptions and a clearly defined “gap” between intra- and interspecies variation. Such distance measures neither quantify classification uncertainty nor indicate how much of the barcode is necessary for classification. We propose a sequential naïve Bayes classifier for species classification to address these limitations. The proposed method is shown to provide accurate species-level classification on real and simulated data. It also quantifies the uncertainty of each classification and indicates how much of the barcode is necessary.
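The abstract does not give the algorithm itself, so the following is only a minimal sketch of what a sequential naïve Bayes classifier over barcode positions could look like, not the authors' implementation. It assumes aligned, equal-length training barcodes, independent positions, a uniform prior over species, Laplace smoothing, and a stopping rule that halts once one species' posterior reaches a threshold. The function names `train` and `classify_sequentially`, the `threshold` and `pseudocount` parameters, and the toy data are all hypothetical.

```python
import math

BASES = "ACGT"


def train(sequences_by_species, pseudocount=1.0):
    """Estimate smoothed per-position base probabilities for each species.

    Assumption: all training barcodes are aligned and of equal length.
    """
    model = {}
    for species, seqs in sequences_by_species.items():
        probs = []
        for pos in range(len(seqs[0])):
            counts = {b: pseudocount for b in BASES}  # Laplace smoothing
            for s in seqs:
                if s[pos] in counts:
                    counts[s[pos]] += 1
            total = sum(counts.values())
            probs.append({b: counts[b] / total for b in BASES})
        model[species] = probs
    return model


def classify_sequentially(model, barcode, threshold=0.95):
    """Read the barcode one base at a time and update species posteriors.

    Stops as soon as one species' posterior reaches `threshold` (an assumed
    stopping rule). Returns (best_species, posterior, bases_read).
    """
    species = list(model)
    # Uniform prior over species (assumption), kept in log space.
    log_post = {sp: -math.log(len(species)) for sp in species}
    posterior = {sp: 1.0 / len(species) for sp in species}
    bases_read = 0
    for pos, base in enumerate(barcode):
        bases_read = pos + 1
        if base not in BASES:
            continue  # skip ambiguous symbols such as N
        for sp in species:
            log_post[sp] += math.log(model[sp][pos][base])
        # Normalise with the log-sum-exp trick to avoid underflow.
        peak = max(log_post.values())
        norm = sum(math.exp(v - peak) for v in log_post.values())
        posterior = {sp: math.exp(log_post[sp] - peak) / norm for sp in species}
        best = max(posterior, key=posterior.get)
        if posterior[best] >= threshold:
            return best, posterior[best], bases_read
    best = max(posterior, key=posterior.get)
    return best, posterior[best], bases_read


# Toy data, invented purely for illustration.
toy_training = {
    "species_a": ["AAAA", "AAAA"],
    "species_b": ["CCCC", "CCCC"],
}
toy_model = train(toy_training)
result = classify_sequentially(toy_model, "AAAA")
```

In this sketch the returned posterior plays the role of the classification uncertainty, and the number of bases read before stopping indicates how much of the barcode was needed, which are the two quantities the abstract says distance-based methods do not provide.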
Keywords: Naïve Bayes classifier; DNA barcoding; phylogenetic analysis; sequential analysis; species classification; species discovery
Date: 2014
Downloads: https://doi.org/10.1515/sagmb-2013-0025 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.
Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:13:y:2014:i:4:p:12:n:3
Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html
DOI: 10.1515/sagmb-2013-0025
Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf