A Probabilistic Perspective on Re-Identifiability
Matthijs Koot, Michel Mandjes, Guido van 't Noordende and Cees de Laat
Mathematical Population Studies, 2013, vol. 20, issue 3, 155-171
Abstract:
A quasi-identifier is a set of attributes that can be used to re-identify entries in anonymized data sets. The setting is a group of individuals about whom quasi-identifying numerical information, such as date of birth, age, weight, and height, is disclosed. First, the fraction of individuals whose information is unique within that group, and who are hence unambiguously identifiable, is determined. Nonuniformity of the attribute distribution can be captured well by a single number, the Kullback-Leibler distance; for example sets of real microdata, approximations based on Kullback-Leibler distances are accurate. Second, the effect of disclosing more specific or less specific information is analyzed experimentally. Third, the effect of correlation between numerical attributes is measured. A formula gives the re-identifiability level. The approximations are validated using publicly available demographic data sets.
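The quantities named in the abstract can be illustrated with a minimal sketch. It is not the paper's method or formula: the function names are hypothetical, the uniform-case expression is the standard birthday-problem style result (each of k individuals is unique with probability (1 - 1/n)^(k-1) when n values are equally likely), and the Kullback-Leibler distance is measured against the uniform distribution as an assumption about the reference distribution.

```python
import math
import random
from collections import Counter


def unique_fraction(values):
    """Fraction of records whose value occurs exactly once in the group."""
    counts = Counter(values)
    return sum(1 for v in values if counts[v] == 1) / len(values)


def expected_unique_fraction_uniform(k, n):
    """Expected unique fraction for k individuals over n equally likely values."""
    return (1 - 1 / n) ** (k - 1)


def kl_from_uniform(probs):
    """Kullback-Leibler distance (in nats) of `probs` from the uniform distribution.

    Assumption: uniform is the reference; zero-probability outcomes contribute 0.
    """
    n = len(probs)
    return sum(p * math.log(p * n) for p in probs if p > 0)


def simulate_unique_fraction(k, probs, trials=500, seed=0):
    """Monte Carlo estimate of the unique fraction for k draws from `probs`."""
    rng = random.Random(seed)
    outcomes = range(len(probs))
    total = 0.0
    for _ in range(trials):
        total += unique_fraction(rng.choices(outcomes, weights=probs, k=k))
    return total / trials


if __name__ == "__main__":
    n, k = 365, 50  # e.g. day of birth disclosed for a group of 50 people
    uniform = [1 / n] * n
    print("closed form (uniform):", expected_unique_fraction_uniform(k, n))
    print("simulated   (uniform):", simulate_unique_fraction(k, uniform))

    # A hypothetical skewed distribution, to compare against the uniform case.
    weights = [1 + (i % 7) for i in range(n)]
    skewed = [w / sum(weights) for w in weights]
    print("KL distance from uniform:", kl_from_uniform(skewed))
    print("simulated   (skewed): ", simulate_unique_fraction(k, skewed))
```

Running the script compares the closed-form uniform value with a simulation, then shows how a nonuniform distribution, summarized by its Kullback-Leibler distance, changes the simulated unique fraction.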
Date: 2013
Downloads: http://hdl.handle.net/10.1080/08898480.2013.816222 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:taf:mpopst:v:20:y:2013:i:3:p:155-171
Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/GMPS20
DOI: 10.1080/08898480.2013.816222
Mathematical Population Studies is currently edited by Prof. Noel Bonneuil, Annick Lesne, Tomasz Zadlo, Malay Ghosh and Ezio Venturino