Improving the Performance of kNN in the MapReduce Framework Using Locality Sensitive Hashing

Sikha Bagui, Arup Kumar Mondal and Subhash Bagui
Additional contact information
Sikha Bagui: University of West Florida, Pensacola, USA
Arup Kumar Mondal: University of West Florida, Pensacola, USA
Subhash Bagui: University of West Florida, Pensacola, USA

International Journal of Distributed Systems and Technologies (IJDST), 2019, vol. 10, issue 4, 1-16

Abstract: The authors present a parallel k-nearest-neighbor (kNN) algorithm that uses locality sensitive hashing (LSH) to preprocess the data before it is classified with kNN in Hadoop's MapReduce framework, and compare it with the conventional sequential implementation. Because LSH tends to place similar objects in the same hash bucket, the iterative procedure that classifies a data object runs within a single bucket rather than over the whole data set, which greatly reduces the computation time needed for classification. Experiments showed that the parallel implementation outperformed the sequential implementation on very large datasets. The study also experimented with several map-side and reduce-side optimization features for the parallel implementation and reports optimum parameter settings. On the map side, the block size and input split size were varied; on the reduce side, the number of planes was varied; the effect of each was studied.
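
The bucket-restricted classification described in the abstract can be illustrated with a minimal single-process sketch. It assumes random-hyperplane LSH (suggested by the abstract's "number of planes", but not confirmed by this record) and Euclidean distance with majority voting; the function names `make_planes`, `bucket_key`, `build_buckets`, and `classify` are hypothetical and are not taken from the paper. It does not reproduce the Hadoop MapReduce implementation or its tuning parameters.

```python
import math
import random
from collections import Counter, defaultdict


def make_planes(n_planes, dim, seed=0):
    """Draw random hyperplane normals (assumed LSH family, not the paper's code)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]


def bucket_key(point, planes):
    """Hash a point to a tuple of bits: which side of each hyperplane it lies on."""
    return tuple(
        int(sum(w * v for w, v in zip(plane, point)) >= 0) for plane in planes
    )


def build_buckets(points, labels, planes):
    """Preprocessing step: group labeled training points by their hash bucket."""
    buckets = defaultdict(list)
    for point, label in zip(points, labels):
        buckets[bucket_key(point, planes)].append((point, label))
    return buckets


def classify(query, buckets, planes, k=3):
    """Run kNN only over the training points that share the query's bucket."""
    candidates = buckets.get(bucket_key(query, planes), [])
    if not candidates:
        return None  # empty bucket: this sketch does not fall back to a full scan
    nearest = sorted(candidates, key=lambda c: math.dist(query, c[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy data: two groups on opposite sides of the origin.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0),
          (-1.0, -1.0), (-2.0, -2.0), (-3.0, -3.0)]
labels = ["a", "a", "a", "b", "b", "b"]

planes = make_planes(n_planes=4, dim=2)
buckets = build_buckets(points, labels, planes)
prediction = classify((4.0, 4.0), buckets, planes)
```

In this sketch, more planes produce more and smaller buckets, so each query is compared against fewer candidates but is more likely to miss true neighbors that hashed elsewhere. That trade-off is one plausible reason the number of planes is worth tuning.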

Date: 2019

Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/IJDST.2019100101 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jdst00:v:10:y:2019:i:4:p:1-16

International Journal of Distributed Systems and Technologies (IJDST) is currently edited by Nik Bessis

More articles in International Journal of Distributed Systems and Technologies (IJDST) from IGI Global

Handle: RePEc:igg:jdst00:v:10:y:2019:i:4:p:1-16