Balancing the privacy-utility tradeoff for synthetic time-to-event data
Sigrid Leithe
Sigrid Leithe: Cancer Registry of Norway
Biostatistics and Epidemiology Virtual Symposium 2025 from Stata Users Group
Abstract:
Generating synthetic patient records can preserve the structure and statistical properties of the original data without violating privacy, providing access to high-quality data for research and innovation. However, few synthetization methods account for the censoring mechanism in time-to-event data, and formal privacy risk evaluations are often lacking. Improvements in synthetic data utility come with an increased risk of privacy disclosure, so careful evaluation is needed to obtain the proper balance. In this talk, I demonstrate a method for generating synthetic time-to-event data in Stata, based on regression models and a flexible parametric survival model. I show how to evaluate the utility of the synthetic data and present a method for estimating the privacy loss from publishing a synthetic dataset.
Date: 2025-03-05
Downloads: (external link)
http://repec.org/biep2025/Bio25_Leithe.pdf presentation materials (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:boc:biep25:01
Bibliographic data for series maintained by Christopher F Baum.