Privacy-Preserving Hybrid K-Means
Zhiqiang Gao,
Yixiao Sun,
Xiaolong Cui,
Yutao Wang,
Yanyu Duan and
Xu An Wang
Additional contact information
Zhiqiang Gao: Engineering University of PAP, Xian, China
Yixiao Sun: Department of Information Engineering, Official College of PAP, Chengdu, China
Xiaolong Cui: Engineering University of PAP, Xian, China
Yutao Wang: Department of Information Engineering, Engineering University of PAP, Xian, China
Yanyu Duan: Department of Information Engineering, Engineering University of PAP, Xian, China
Xu An Wang: Engineering University of PAP, Xian, China
International Journal of Data Warehousing and Mining (IJDWM), 2018, vol. 14, issue 2, 1-17
Abstract:
This article describes how k-means, the most widely used clustering algorithm, is prone to falling into a local optimum. Moreover, traditional clustering approaches operate directly on private data and, in massive data mining tasks, cannot withstand malicious attacks by adversaries with arbitrary background knowledge. This can result in violations of individuals' privacy, as well as leaks through system resources and clustering outputs. To address these issues, the authors propose an efficient privacy-preserving hybrid k-means under Spark. In the first stage, particle swarm optimization is executed on resilient distributed datasets to select the initial clustering centroids for k-means on Spark. In the second stage, k-means is executed with the privacy budget set to ε/2t and Laplace noise added in each round of iterations. Extensive experimentation on public UCI data sets shows that, while preserving the utility of the private data and scalability, their approach outperforms state-of-the-art variants of k-means by combining swarm intelligence with the rigorous paradigm of differential privacy.
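The second stage described in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' Spark implementation, and it rests on several assumptions that the abstract does not confirm: ε/2t is read as splitting the total budget ε over t rounds, with half of each round's share spent on cluster counts and half on coordinate sums; every coordinate is assumed to lie in [-bound, bound]; and the initial centroids, which the article obtains from particle swarm optimization, are simply passed in. The names `laplace_noise`, `dp_kmeans`, and `bound` are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_kmeans(points, centroids, epsilon, rounds, bound=1.0, seed=0):
    """Noisy Lloyd iterations (sketch).

    Assumption: each round spends epsilon / (2 * rounds) on the counts and
    the same amount on the sums, so the total over all rounds is epsilon.
    Assumption: every coordinate of every point lies in [-bound, bound].
    """
    rng = random.Random(seed)
    dim = len(points[0])
    eps_step = epsilon / (2 * rounds)
    centroids = [list(c) for c in centroids]

    for _ in range(rounds):
        sums = [[0.0] * dim for _ in centroids]
        counts = [0] * len(centroids)

        # Assignment step: attach each point to its nearest centroid.
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            counts[nearest] += 1
            for d in range(dim):
                sums[nearest][d] += p[d]

        # Update step: perturb counts and sums before dividing.
        for j in range(len(centroids)):
            # One record changes one count by 1, so the sensitivity is 1.
            noisy_count = counts[j] + laplace_noise(1.0 / eps_step, rng)
            if noisy_count < 1.0:
                continue  # too little signal; keep the previous centroid
            for d in range(dim):
                # One record changes the sum vector by at most dim * bound in L1.
                noisy_sum = sums[j][d] + laplace_noise(dim * bound / eps_step, rng)
                centroids[j][d] = max(-bound, min(bound, noisy_sum / noisy_count))

    return centroids
```

Calling `dp_kmeans(points, initial_centroids, epsilon=1.0, rounds=5)` returns one noisy centroid per initial centroid. Smaller values of `epsilon` or more rounds increase the noise scale in every iteration, which is the utility trade-off the abstract refers to.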
Date: 2018
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/IJDWM.2018040101 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jdwm00:v:14:y:2018:i:2:p:1-17
International Journal of Data Warehousing and Mining (IJDWM) is currently edited by Eric Pardede