Two-point-based binary search trees for accelerating big data classification using KNN
Ahmad B A Hassanat
PLOS ONE, 2018, vol. 13, issue 11, 1-15
Abstract:
Big data classification is very slow when using traditional machine learning classifiers, particularly when using a lazy and slow-by-nature classifier such as the k-nearest neighbors algorithm (KNN). This paper proposes a new approach which is based on sorting the feature vectors of training data in a binary search tree to accelerate big data classification using the KNN approach. This is done using two methods, both of which utilize two local points to sort the examples based on their similarity to these local points. The first method chooses the local points based on their similarity to the global extreme points, while the second method chooses the local points randomly. The results of various experiments conducted on different big datasets show reasonable accuracy rates compared to state-of-the-art methods and the KNN classifier itself. More importantly, they show the high classification speed of both methods. This strong trait can be used to further improve the accuracy of the proposed methods.
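The second method described in the abstract (local points chosen randomly) can be illustrated with a minimal sketch. This is not the paper's implementation: the names `Node`, `build`, `classify`, and `leaf_size` are hypothetical, and the Euclidean distance, the leaf-size stopping rule, and the majority vote inside the reached leaf are assumptions made here for illustration only.

```python
import random
from collections import Counter
from math import dist  # Euclidean distance (Python 3.8+); an assumed similarity measure


class Node:
    """Internal nodes hold two local points; leaves hold (vector, label) pairs."""

    def __init__(self, p1=None, p2=None, left=None, right=None, items=None):
        self.p1, self.p2 = p1, p2
        self.left, self.right = left, right
        self.items = items  # None for internal nodes, a list for leaves


def build(items, leaf_size=8, rng=None):
    """Recursively split the examples by which of two random local points is closer."""
    rng = rng or random.Random(0)
    if len(items) <= leaf_size:
        return Node(items=items)
    # Pick two training examples at random to act as the local points.
    (p1, _), (p2, _) = rng.sample(items, 2)
    left, right = [], []
    for vec, label in items:
        (left if dist(vec, p1) <= dist(vec, p2) else right).append((vec, label))
    # If the two points coincide, everything lands on one side; stop splitting.
    if not left or not right:
        return Node(items=items)
    return Node(p1, p2, build(left, leaf_size, rng), build(right, leaf_size, rng))


def classify(root, query, k=3):
    """Descend to a leaf with the same rule used in build, then vote among its k nearest."""
    node = root
    while node.items is None:
        closer_to_p1 = dist(query, node.p1) <= dist(query, node.p2)
        node = node.left if closer_to_p1 else node.right
    nearest = sorted(node.items, key=lambda item: dist(query, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

In this sketch a query costs one pair of distance computations per tree level plus a small search inside one leaf, instead of a distance computation against every training example. The trade-off is that the true nearest neighbors may sit in a different leaf, so the result is an approximation of exact KNN.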
Date: 2018
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0207772 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 07772&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0207772
DOI: 10.1371/journal.pone.0207772
More articles in PLOS ONE from Public Library of Science