Bayesian nonparametric disclosure risk assessment

Stefano Favaro, Francesca Panero and Tommaso Rigon

LSE Research Online Documents on Economics from London School of Economics and Political Science, LSE Library

Abstract: Any decision about the release of microdata for public use is supported by the estimation of measures of disclosure risk, the most popular being the number τ1 of sample uniques that are also population uniques. In this context, parametric and nonparametric partition-based models have been shown to have: i) the strength of leading to estimators of τ1 with desirable features, including ease of implementation, computational efficiency and scalability to massive data; ii) the weakness of underestimating τ1 in realistic scenarios, with the underestimation worsening as the tail of the empirical distribution of the microdata gets heavier. To correct this underestimation, we propose a Bayesian nonparametric partition-based model that can be tuned to the tail behaviour of the empirical distribution of the microdata. Our model relies on the Pitman–Yor process prior, and it leads to a novel estimator of τ1 that retains all the desirable features of partition-based estimators and, in addition, allows underestimation to be reduced by tuning a “discount” parameter. We show the effectiveness of our estimator through its application to synthetic and real data.
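
To make the quantity τ1 concrete, the following minimal Python sketch computes it by direct counting on synthetic data. It is an illustration only and not the estimator proposed in the paper: it assumes the whole population is observed, which is never the case in practice. The function names (`pitman_yor_partition`, `tau1`), the population and sample sizes, and the parameter values are hypothetical choices made for this example. The synthetic population is generated with the standard sequential (Chinese restaurant) scheme for the Pitman–Yor process, assuming a discount in [0, 1) and a strictly positive concentration parameter.

```python
import random
from collections import Counter


def pitman_yor_partition(n, discount, concentration, rng):
    """Assign n records to cells with the Pitman-Yor sequential scheme.

    Assumes 0 <= discount < 1 and concentration > 0 (illustrative choice).
    Returns a list giving the cell label of each record.
    """
    labels = []  # cell label of each record, in order of arrival
    sizes = []   # current number of records in each cell
    for i in range(n):
        k = len(sizes)
        # Probability that record i opens a new cell, given i earlier records.
        p_new = (concentration + discount * k) / (concentration + i)
        if rng.random() < p_new:
            sizes.append(1)
            labels.append(k)
        else:
            # An existing cell j is chosen with weight (size_j - discount).
            weights = [s - discount for s in sizes]
            j = rng.choices(range(k), weights=weights)[0]
            sizes[j] += 1
            labels.append(j)
    return labels


def tau1(population_labels, sample_indices):
    """Count cells that are unique in the sample and unique in the population."""
    pop_counts = Counter(population_labels)
    sample_counts = Counter(population_labels[i] for i in sample_indices)
    return sum(
        1
        for cell, count in sample_counts.items()
        if count == 1 and pop_counts[cell] == 1
    )


rng = random.Random(0)
# Hypothetical sizes: 2000 records in the population, 200 in the sample.
population = pitman_yor_partition(2000, discount=0.5, concentration=1.0, rng=rng)
sample = rng.sample(range(len(population)), 200)  # sampling without replacement
print("sample uniques that are population uniques:", tau1(population, sample))
```

Raising `discount` towards 1 makes the simulated cell sizes heavier-tailed and produces more singleton cells, which is the regime in which the abstract reports that existing partition-based estimators underestimate τ1.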

Keywords: Bayesian nonparametrics; data confidentiality; Dirichlet process prior; disclosure risk assessment; empirical Bayes; Pitman–Yor process prior
JEL-codes: C1
Pages: 26
Date: 2021-12-27
New Economics Papers: this item is included in nep-rmg

Published in Electronic Journal of Statistics, 15(2), 27 December 2021, pp. 5626-5651. ISSN: 1935-7524

Downloads: (external link)
http://eprints.lse.ac.uk/117305/ Open access version. (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:ehl:lserod:117305

Page updated 2024-12-28
Handle: RePEc:ehl:lserod:117305