Data transformation techniques for preserving privacy in distance-based mining algorithms
Mohammad Ali Kadampur and D.V.L.N. Somayajulu
International Journal of Data Mining, Modelling and Management, 2014, vol. 6, issue 3, 285-311
Abstract:
Calculating the dissimilarity between two objects is an important knowledge-gathering method in cognitive science. Many data mining algorithms use dissimilarity computation to cluster data in order to reveal intra-relations, inter-relations, and outliers. The majority of these algorithms use Euclidean distance as the dissimilarity criterion. This paper explores signal transformation functions, with their orthogonality and energy compaction properties, for transforming the data. The data transformation scheme treats the entire dataset as a single entity. The proposed scheme is designed so that it can also be applied to non-Euclidean spaces by means of a distance mapping algorithm. Existing randomisation approaches to data transformation preserve only the distributions and do not preserve the Euclidean distances between records. The proposed methods are superior to the existing methods in terms of run-time complexity, O(n), and preservation of the distances between individual data points.
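The distance-preservation property mentioned in the abstract can be illustrated with a small sketch. It is not the authors' scheme: it assumes, purely for illustration, that each record is transformed separately with an orthonormal Haar wavelet transform, and the names `haar_transform`, `euclidean`, and `records`, along with the sample values, are hypothetical. Because the transform is linear and orthonormal, the Euclidean distance between any two transformed records equals the distance between the originals.

```python
import math


def haar_transform(record):
    """Full orthonormal Haar wavelet transform of one record.

    Assumption for this sketch: the record length is a power of two.
    """
    n = len(record)
    if n == 0 or n & (n - 1):
        raise ValueError("record length must be a power of two")
    out = [float(x) for x in record]
    length = n
    while length > 1:
        half = length // 2
        # Scaled pairwise sums (approximation) and differences (detail).
        approx = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:length] = approx + detail
        length = half  # recurse on the approximation coefficients only
    return out


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical records with four numeric attributes each.
records = [
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 3.0, 5.0, 9.0],
    [7.0, 0.5, 2.5, 4.0],
]
transformed = [haar_transform(r) for r in records]

# Compare every pairwise distance before and after the transform.
for i in range(len(records)):
    for j in range(i + 1, len(records)):
        before = euclidean(records[i], records[j])
        after = euclidean(transformed[i], transformed[j])
        print(f"pair ({i}, {j}): before={before:.6f} after={after:.6f}")
```

The transformed values differ from the originals, yet a distance-based algorithm such as k-means or nearest-neighbour search would see the same pairwise geometry. An orthonormal transform alone is only a sketch of the idea; how strongly it protects privacy depends on details the abstract does not give.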
Keywords: privacy preservation; privacy protection; data perturbation; wavelet transforms; data mining; data transformation; distance-based mining; signal transformation functions; Euclidean distance.
Date: 2014
Downloads: (external link)
http://www.inderscience.com/link.php?id=65148 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:ijdmmm:v:6:y:2014:i:3:p:285-311
More articles in International Journal of Data Mining, Modelling and Management from Inderscience Enterprises Ltd