Bayesian estimation of disclosure risks for synthetic time-to-event data

Sigrid Leithe
Sigrid Leithe: Cancer Registry of Norway–Norwegian Institute of Public Health

Northern European Stata Conference 2024 from Stata Users Group

Abstract: Introduction: Generating synthetic patient records can preserve the structure and statistical properties of the original data while maintaining privacy, providing access to high-quality data for research and innovation. Few synthesis methods account for the censoring mechanisms in time-to-event data, and formal privacy evaluations are often lacking. Improvements in synthetic data utility come with increased risks of privacy disclosure, so careful evaluation is needed to strike the proper balance. Methods: We generate synthetic time-to-event data based on colon cancer data from the Cancer Registry of Norway, using a sequence of conditional regression models and flexible parametric modeling of event times. Different levels of model complexity are used to investigate the impact on data utility and disclosure risk. Privacy risk is evaluated using Bayesian estimation of disclosure risks, which form the basis for a differential privacy audit. Results: Including more interaction terms and increasing the degrees of freedom improves synthetic data utility but also elevates privacy risks. While certain interactions substantially improve utility, others reduce privacy without much utility gain. The most complex model displays near-optimal utility scores. Conclusions: The results demonstrate a clear tradeoff between synthetic data utility and privacy risks. Interestingly, the relationship is nonlinear: certain modeling choices increase synthetic data utility with little privacy loss, while others increase privacy loss with little utility gain.

Downloads:
http://repec.org/neur2024/Northern_Europe24_Leithe.pdf (presentation materials, application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:boc:neur24:10

Page updated 2025-03-19
Handle: RePEc:boc:neur24:10