A Simple Method for Limiting Disclosure in Continuous Microdata Based on Principal Component Analysis
Calviño Aida
Additional contact information
Calviño Aida: Department of Computer Science and Mathematics, Universitat Rovira i Virgili, 43007 Tarragona, Spain
Journal of Official Statistics, 2017, vol. 33, issue 1, 15-41
Abstract:
In this article we propose a simple and versatile method for limiting disclosure in continuous microdata based on Principal Component Analysis (PCA). Instead of perturbing the original variables, we propose to alter the principal components, as they contain the same information but are uncorrelated, which permits working on each component separately, reducing processing times. The number and weight of the perturbed components determine the level of protection and distortion of the masked data. The method provides preservation of the mean vector and the variance-covariance matrix. Furthermore, depending on the technique chosen to perturb the principal components, the proposed method can provide masked, hybrid or fully synthetic data sets. Some examples of application and comparison with other methods previously proposed in the literature (in terms of disclosure risk and data utility) are also included.
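The idea described in the abstract can be illustrated with a minimal sketch. This is not the article's algorithm: it assumes one particular perturbation, in which the selected principal-component scores are replaced by random noise that is centered, made orthogonal to the untouched components, and rescaled to the original component variances. Under that assumption the mean vector and the variance-covariance matrix of the masked data match those of the original. The function name `mask_with_pca` and its `perturb` argument are hypothetical, and the data are simulated.

```python
import numpy as np

def mask_with_pca(X, perturb, rng):
    """Replace the principal-component scores listed in `perturb` with noise.

    Hypothetical helper (an assumption, not the published method): the new
    scores are random but forced to be centered, orthogonal to the untouched
    components, and scaled to the original variances, so the mean vector and
    covariance matrix are unchanged.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    mean = X.mean(axis=0)
    # Thin SVD of the centered data: scores are U * s, loadings are rows of Vt.
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)

    perturb = np.asarray(perturb, dtype=int)
    keep = np.setdiff1d(np.arange(p), perturb)

    # Random directions, centered and projected away from the kept components.
    Z = rng.standard_normal((n, perturb.size))
    Z -= Z.mean(axis=0)
    Z -= U[:, keep] @ (U[:, keep].T @ Z)
    Q, _ = np.linalg.qr(Z)  # orthonormal columns spanning the same space as Z

    U_new = U.copy()
    U_new[:, perturb] = Q
    # Map the (partly replaced) scores back to the original variables.
    return mean + (U_new * s) @ Vt

# Simulated example: 200 records, 4 correlated continuous variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4)) @ rng.standard_normal((4, 4)) + 10.0
X_masked = mask_with_pca(X, perturb=[2, 3], rng=rng)
```

In this sketch, perturbing only some components gives data that still depend partly on the original records, while listing every component in `perturb` yields records that share only the mean and covariance with the original. The sketch assumes more records than variables and a full-rank data matrix.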
Keywords: Statistical disclosure control; microdata protection; hybrid microdata; masking method; propensity score
Date: 2017
Downloads: (external link)
https://doi.org/10.1515/jos-2017-0002 (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:vrs:offsta:v:33:y:2017:i:1:p:15-41:n:2
DOI: 10.1515/jos-2017-0002
Journal of Official Statistics is currently edited by Annica Isaksson and Ingegerd Jansson
More articles in Journal of Official Statistics from Sciendo
Bibliographic data for series maintained by Peter Golla.