Identification of BOLD engine deficiencies and suggestions for improvement based on a curated Tachina (Diptera) record set
Frederik Stein and Oliver Gailing
PLOS ONE, 2025, vol. 20, issue 8, 1-18
Abstract:
The increasing number of Barcode of Life Database (BOLD) records per species and genus leads to contradictory species assignments within Barcode Index Numbers (BINs), which serve as identifiers for the BOLD ID engine. To examine these issues, we analyzed a dataset comprising original and curated BOLD records for the genus Tachina (Insecta: Tachinidae), based on a previous publication. This dataset included both published and private records. We were able to assess the performance of the BOLD engine’s species determination algorithm, Refined Single Linkage (RESL), and compare it to Assemble Species by Automatic Partitioning (ASAP). Additionally, we investigated the usage of BINs by the BOLD v4 ID engine. Our analysis confirmed that BOLD queries primarily rely on BINs for species identification, although some cases deviated from this pattern, resulting in species matches inconsistent with the assigned BIN species. ASAP was found to be superior to RESL due to RESL’s adherence to the concept of the DNA barcoding gap. Moreover, we found that taxonomic misassignments, inconsistencies in BIN formation, and missing metadata also contribute significantly to unreliable identifications. These problems appear to stem from both algorithmic limitations and deficiencies in submission and post-submission processes. We also noted that the default mode of the BOLD v4 ID engine integrates both private and published data, leading to public records based solely on COI-based identifications. However, this issue may now be mitigated, as the BOLD v5 ID engine default mode exclusively employs published data. To enhance BOLD’s reliability, we propose improvements to submission and post-submission processes. Without such amendments, contradictory species assignments within BINs will continue to accumulate and the reliability of specimen identification by BOLD will decrease.
Date: 2025
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0331216 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 31216&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0331216
DOI: 10.1371/journal.pone.0331216
More articles in PLOS ONE  from  Public Library of Science