
Maximizing Accuracy of Shared Databases when Concealing Sensitive Patterns

Syam Menon, Sumit Sarkar and Shibnath Mukherjee
Additional contact information
Syam Menon: School of Management, University of Texas at Dallas, Richardson, Texas 75083
Sumit Sarkar: School of Management, University of Texas at Dallas, Richardson, Texas 75083
Shibnath Mukherjee: School of Management, University of Texas at Dallas, Richardson, Texas 75083

Information Systems Research, 2005, vol. 16, issue 3, 256-270

Abstract: The sharing of databases either within or across organizations raises the possibility of unintentionally revealing sensitive relationships contained in them. Recent advances in data-mining technology have increased the chances of such disclosure. Consequently, firms that share their databases might choose to hide these sensitive relationships prior to sharing. Ideally, the approach used to hide relationships should be impervious to as many data-mining techniques as possible, while minimizing the resulting distortion to the database. This paper focuses on frequent item sets, the identification of which forms a critical initial step in a variety of data-mining tasks. It presents an optimal approach for hiding sensitive item sets, while keeping the number of modified transactions to a minimum. The approach is particularly attractive as it easily handles databases with millions of transactions. Results from extensive tests conducted on publicly available real data and data generated using IBM’s synthetic data generator indicate that the approach presented is very effective, optimally solving problems involving millions of transactions in a few seconds.
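
To make the problem in the abstract concrete, here is a minimal sketch of what "hiding sensitive item sets while keeping the number of modified transactions to a minimum" could look like on a toy database. It is not the paper's method: the abstract does not describe the algorithm, so this uses plain brute-force enumeration, which would not scale to millions of transactions. The function names, the toy data, and the rule that an item set counts as hidden once its support falls below `min_support` are all assumptions made for illustration.

```python
from itertools import combinations


def support(transactions, itemset):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def min_transactions_to_sanitize(transactions, sensitive, min_support):
    """Brute-force search (illustrative assumption, not the paper's approach).

    Returns a smallest list of transaction indices such that, if each chosen
    transaction were altered to stop supporting the sensitive item sets,
    every sensitive item set would have support below `min_support`.
    """
    # Only transactions that support at least one sensitive item set matter.
    candidates = [
        i for i, t in enumerate(transactions)
        if any(s <= t for s in sensitive)
    ]
    # Try subsets in order of increasing size, so the first hit is minimal.
    for k in range(len(candidates) + 1):
        for chosen in combinations(candidates, k):
            chosen_set = set(chosen)
            hidden = all(
                sum(
                    1 for i, t in enumerate(transactions)
                    if s <= t and i not in chosen_set
                ) < min_support
                for s in sensitive
            )
            if hidden:
                return sorted(chosen_set)
    return None


# Hypothetical toy database: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "beer"},
    {"milk", "beer"},
    {"bread", "milk", "beer"},
    {"bread", "beer"},
]
sensitive = [frozenset({"bread", "milk"}), frozenset({"milk", "beer"})]

print(support(transactions, sensitive[0]))
print(min_transactions_to_sanitize(transactions, sensitive, min_support=3))
```

In this toy case both sensitive item sets start with a support of 3, and altering the single transaction at index 1 brings both below the threshold, which illustrates why a transaction that supports several sensitive item sets is an attractive one to modify.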

Keywords: data quality; privacy; item set mining
Date: 2005

Download (PDF): http://dx.doi.org/10.1287/isre.1050.0056

Persistent link: https://EconPapers.repec.org/RePEc:inm:orisre:v:16:y:2005:i:3:p:256-270

Publisher: INFORMS

Handle: RePEc:inm:orisre:v:16:y:2005:i:3:p:256-270