
Identification, data combination and the risk of disclosure

Tatiana V. Komarova, Denis Nekipelov and Evgeny Yakovlev
Additional contact information
Tatiana V. Komarova: Institute for Fiscal Studies and London School of Economics and Political Science

No CWP38/11, CeMMAP working papers from Centre for Microdata Methods and Practice, Institute for Fiscal Studies

Abstract: Businesses routinely rely on econometric models to analyze and predict consumer behavior. Estimating such models may require combining a firm's internal data with external datasets to account for sample selection, missing observations, omitted variables and measurement errors in the existing data source. In this paper we point out that these data problems can be addressed when estimating econometric models from combined data using data mining techniques, under mild assumptions regarding the data distribution. However, data combination poses serious threats to the security of consumer data: we demonstrate that point identification of an econometric model from combined data is incompatible with restrictions on the risk of individual disclosure. Consequently, if a consumer model is point identified, the firm would (implicitly or explicitly) reveal the identity of at least some of the consumers in its internal data. More importantly, we argue that unless the firm restricts the individual disclosure risk when combining data, an adversary or a competitor can gather confidential information about some individuals from the estimated model, even if the raw combined dataset is not shared with a third party.

Date: 2011-12-20
New Economics Papers: this item is included in nep-ecm

Downloads: (external link)
http://cemmap.ifs.org.uk/wps/cwp3811.pdf (application/pdf)

Related works:
Journal Article: Identification, data combination, and the risk of disclosure (2018)
Working Paper: Identification, data combination and the risk of disclosure (2018)


Persistent link: https://EconPapers.repec.org/RePEc:ifs:cemmap:38/11

Ordering information: This working paper can be ordered from
The Institute for Fiscal Studies 7 Ridgmount Street LONDON WC1E 7AE

Bibliographic data for series maintained by Emma Hyman.

 
Page updated 2020-02-26
Handle: RePEc:ifs:cemmap:38/11