Improving the Performance of kNN in the MapReduce Framework Using Locality Sensitive Hashing
Sikha Bagui, Arup Kumar Mondal and Subhash Bagui
Additional contact information
Sikha Bagui: University of West Florida, Pensacola, USA
Arup Kumar Mondal: University of West Florida, Pensacola, USA
Subhash Bagui: University of West Florida, Pensacola, USA
International Journal of Distributed Systems and Technologies (IJDST), 2019, vol. 10, issue 4, 1-16
Abstract:
In this work, the authors present a parallel k-nearest neighbor (kNN) algorithm that uses locality sensitive hashing to preprocess the data before it is classified with kNN in Hadoop's MapReduce framework. This parallel implementation is compared with the sequential (conventional) implementation. Because locality sensitive hashing tends to place similar data objects in the same hash bucket, the iterative procedure to classify a data object is performed within a hash bucket rather than over the whole data set, greatly reducing the computation time needed for classification. Several experiments showed that the parallel implementation performed better than the sequential implementation on very large datasets. The study also experimented with a few map-side and reduce-side optimization features for the parallel implementation and presents some optimum map-side and reduce-side parameters. Among the map-side parameters, the block size and input split size were varied; among the reduce-side parameters, the number of planes was varied. The effect of each was studied.
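The bucket-restricted classification described in the abstract can be illustrated with a small single-machine sketch. This is not the authors' implementation: it assumes a random-hyperplane hash (suggested only by the abstract's mention of a "number of planes"), Euclidean distance, and majority voting, and every function name here is hypothetical. In a MapReduce setting, the bucket key would plausibly serve as the shuffle key, with the per-bucket kNN step running on the reduce side; that mapping is likewise an assumption.

```python
import random
from collections import Counter, defaultdict


def make_planes(num_planes, dim, seed=0):
    """Draw random hyperplane normals (assumed hash family, not from the paper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def bucket_key(point, planes):
    """One bit per plane: 1 if the point lies on the non-negative side."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, point)) >= 0 else 0
        for plane in planes
    )


def build_buckets(train, planes):
    """Group labelled training points by their hash key."""
    buckets = defaultdict(list)
    for point, label in train:
        buckets[bucket_key(point, planes)].append((point, label))
    return buckets


def knn_in_bucket(query, buckets, planes, k=3):
    """Classify the query using only the points that share its bucket."""
    candidates = buckets.get(bucket_key(query, planes), [])
    if not candidates:
        return None  # empty bucket: a real system would need a fallback
    nearest = sorted(
        candidates,
        key=lambda pl: sum((a - b) ** 2 for a, b in zip(pl[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


if __name__ == "__main__":
    # Fixed axis-aligned planes keep this toy example deterministic.
    planes = [[1.0, 0.0], [0.0, 1.0]]
    train = [
        ([1.0, 1.0], "a"), ([2.0, 1.0], "a"), ([1.0, 2.0], "a"),
        ([-1.0, -1.0], "b"), ([-2.0, -1.0], "b"),
    ]
    buckets = build_buckets(train, planes)
    print(knn_in_bucket([1.5, 1.5], buckets, planes))
```

Adding planes yields more, smaller buckets, so each query is compared against fewer points, at the cost of a higher chance that true neighbors land in a different bucket. This trade-off is presumably why the number of planes is one of the parameters the study varies.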
Date: 2019
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/IJDST.2019100101 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jdst00:v:10:y:2019:i:4:p:1-16
International Journal of Distributed Systems and Technologies (IJDST) is currently edited by Nik Bessis