
Protecting Privacy Against Record Linkage Disclosure: A Bounded Swapping Approach for Numeric Data

Xiao-Bai Li and Sumit Sarkar
Additional contact information
Xiao-Bai Li: Department of Operations and Information Systems, University of Massachusetts Lowell, Lowell, Massachusetts 01854
Sumit Sarkar: School of Management, University of Texas at Dallas, Richardson, Texas 75080

Information Systems Research, 2011, vol. 22, issue 4, 774-789

Abstract: Record linkage techniques have been widely used in areas such as antiterrorism, crime analysis, epidemiologic research, and database marketing. On the other hand, such techniques are also being increasingly used for identity matching that leads to the disclosure of private information. These techniques can be used to effectively reidentify records even in deidentified data. Consequently, the use of such techniques can lead to individual privacy being severely eroded. Our study addresses this important issue and provides a solution to resolve the conflict between privacy protection and data utility. We propose a data-masking method for protecting private information against record linkage disclosure that preserves the statistical properties of the data for legitimate analysis. Our method recursively partitions a data set into smaller subsets such that data records within each subset are more homogeneous after each partition. The partition is made orthogonal to the maximum variance dimension represented by the first principal component in each partitioned set. The attribute values of a record in a subset are then masked using a double-bounded swapping method. The proposed method, which we call multivariate swapping trees, is nonparametric in nature and does not require any assumptions about statistical distributions of the original data. Experiments conducted on real-world data sets demonstrate that the proposed approach significantly outperforms existing methods in terms of both preventing identity disclosure and preserving data quality.
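
Illustrative sketch (not the authors' implementation): the abstract describes recursive partitioning orthogonal to the first principal component, followed by swapping within each subset. The Python below shows one way that general idea could look. Everything specific in it is an assumption made for illustration: the function names, the median split point, the max_leaf_size stopping rule, and above all the swap step, which is a plain random permutation of each attribute inside a leaf and not the paper's double-bounded swapping method, whose details the abstract does not give.

```python
import numpy as np

def first_pc_split(X, idx):
    """Split the records in idx into two halves by first-principal-component score."""
    sub = X[idx]
    centered = sub - sub.mean(axis=0)
    # Rows of vt are principal directions; vt[0] is the maximum-variance one.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    order = np.argsort(scores, kind="stable")
    half = len(idx) // 2  # assumed split point: the median score
    return idx[order[:half]], idx[order[half:]]

def build_leaves(X, idx, max_leaf_size):
    """Recursively partition until each subset has at most max_leaf_size records."""
    if len(idx) <= max_leaf_size:
        return [idx]
    left, right = first_pc_split(X, idx)
    return (build_leaves(X, left, max_leaf_size)
            + build_leaves(X, right, max_leaf_size))

def swap_within_leaves(X, max_leaf_size=8, seed=0):
    """Mask X by permuting each attribute independently inside every leaf.

    Assumption: a simple stand-in for the swap step, NOT double-bounded swapping.
    """
    if max_leaf_size < 1:
        raise ValueError("max_leaf_size must be at least 1")
    rng = np.random.default_rng(seed)
    masked = X.copy()
    leaves = build_leaves(X, np.arange(X.shape[0]), max_leaf_size)
    for leaf in leaves:
        for j in range(X.shape[1]):
            masked[leaf, j] = X[rng.permutation(leaf), j]
    return masked, leaves
```

Calling swap_within_leaves(X) on a numeric array with one record per row returns the masked array and the list of leaf index arrays. Because values only move between records in the same leaf, each attribute's set of values (and therefore its mean and variance) is unchanged, while the link between a record and its original values is broken.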

Keywords: privacy; record linkage; data partitioning; data swapping
Date: 2011
Citations in EconPapers: 9

Downloads: (external link)
http://dx.doi.org/10.1287/isre.1100.0289 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:inm:orisre:v:22:y:2011:i:4:p:774-789

Page updated 2025-03-19
Handle: RePEc:inm:orisre:v:22:y:2011:i:4:p:774-789