RETRACTED ARTICLE: Innovative study on clustering center and distance measurement of K-means algorithm: mapreduce efficient parallel algorithm based on user data of JD mall

Yang Liu, Xinxin Du and Shuaifeng Ma
Additional contact information
Yang Liu: Southwestern University of Finance and Economics
Xinxin Du: Jingdong Century Trading Co., Ltd
Shuaifeng Ma: Jingdong Century Trading Co., Ltd

Electronic Commerce Research, 2023, vol. 23, issue 1, No 3, 43-73

Abstract: The traditional K-means algorithm is very sensitive to the selection of clustering centers and the calculation of distances, so it easily converges to a locally optimal solution. It also converges slowly, clusters with low accuracy, and runs into memory bottlenecks when processing massive data. This paper therefore proposes an improved K-means algorithm. First, the selection of the initial points in the traditional clustering algorithm is improved. Then a new global measure, the effective distance measure, is proposed; its main idea is to calculate the effective distance between two data samples by sparse reconstruction. Finally, on the basis of the MapReduce framework, the efficiency of the algorithm is further improved by adjusting the Hadoop cluster. Using real customer data from the JD Mall dataset, the paper applies the DBI, Rand and other indicators to evaluate the clustering results of various algorithms. The results show that the proposed algorithm not only has good convergence and accuracy but also outperforms the compared algorithms.
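
For orientation, the MapReduce formulation mentioned in the abstract can be sketched for plain K-means. The block below is a minimal single-process Python sketch under stated assumptions: it implements standard K-means with squared Euclidean distance and caller-supplied initial centers, not the paper's improved initialization or its effective distance measure, and it does not use Hadoop. The function names (`map_phase`, `reduce_phase`, `kmeans`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance (an assumption; the paper uses its own measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_phase(points: Iterable[Point], centers: List[Point]) -> Iterator[Tuple[int, Point]]:
    """Map: emit (index of nearest center, point) for every input point."""
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: squared_euclidean(p, centers[i]))
        yield nearest, p


def reduce_phase(pairs: Iterable[Tuple[int, Point]], centers: List[Point]) -> List[Point]:
    """Reduce: average the points grouped under each center index."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = defaultdict(int)
    for idx, p in pairs:
        if idx in sums:
            sums[idx] = [s + x for s, x in zip(sums[idx], p)]
        else:
            sums[idx] = list(p)
        counts[idx] += 1
    updated: List[Point] = []
    for i, c in enumerate(centers):
        if counts[i] == 0:
            updated.append(tuple(c))  # empty cluster: keep the old center
        else:
            updated.append(tuple(s / counts[i] for s in sums[i]))
    return updated


def kmeans(points: List[Point], centers: List[Point],
           max_iter: int = 100, tol: float = 1e-9) -> List[Point]:
    """Repeat map and reduce until no center moves by more than tol (squared)."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centers), centers)
        shift = max(squared_euclidean(a, b) for a, b in zip(centers, updated))
        centers = updated
        if shift <= tol:
            break
    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(kmeans(data, [(0.0, 0.0), (10.0, 10.0)]))
```

In a real MapReduce job the map step would run in parallel over data partitions and the reduce step would run once per center key; here both run in one process so that the data flow is easy to follow.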

Keywords: K-means; Clustering center; Distance measurement; MapReduce; Parallel computing
Date: 2023
Downloads: (external link)
http://link.springer.com/10.1007/s10660-021-09458-z Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:elcore:v:23:y:2023:i:1:d:10.1007_s10660-021-09458-z

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10660

DOI: 10.1007/s10660-021-09458-z

Electronic Commerce Research is currently edited by James Westland

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:elcore:v:23:y:2023:i:1:d:10.1007_s10660-021-09458-z