Establishing a FAIR, CARE, and Efficient Synthetic Health Data Sharing Ecosystem for Canada
Helen Chen,
Maura R. Grossman,
Anindya Sen and
Shu-Feng Tsao
Additional contact information
Helen Chen: School of Public Health Sciences, University of Waterloo
Maura R. Grossman: School of Public Health Sciences and Cheriton School of Computer Science, University of Waterloo
Anindya Sen: Department of Economics, University of Waterloo
Shu-Feng Tsao: School of Public Health Sciences, University of Waterloo
Authors registered in the RePEc Author Service: Matthew Doyle
No 24002, Working Papers from University of Waterloo, Department of Economics
Abstract:
Obtaining access to real-world health data is a significant challenge, mainly due to privacy and security implications. Consequently, researchers and technology innovators, particularly those operating in the health data science and AI technology development spaces, increasingly resort to synthetic health data to bridge the data gap. High-quality synthetic data has the potential to expedite research and development of novel technologies. However, synthetic health datasets in Canada are scarce, and no existing synthetic health datasets conform to the Findable, Accessible, Interoperable, and Reusable (FAIR) standards. Moreover, while federated machine learning offers the advantage of protecting patient privacy by not requiring the exchange of source data across nodes, it has yet to be optimized in Canada’s health research environment, and there is limited use of federated learning with synthetic health data. This paper explores the ethical considerations and value proposition of generating and sharing synthetic health data. Our goal is to facilitate the development of a reliable and sustainable synthetic data infrastructure that supports the ethical, responsible, and efficient use of synthetic health data. An important contribution of this research is the establishment of a framework that balances the social benefits of innovation from data sharing with the social costs that occur when individual privacy is compromised. The use of synthetic data significantly reduces the potential for individual harm and is a cost-effective means to lower data-sharing costs. We believe that this framework will pave the way for a more robust and secure synthetic data ecosystem, enabling the generation of valuable insights that can drive positive health outcomes for Canadians.
Date: 2024-02
Persistent link: https://EconPapers.repec.org/RePEc:wat:wpaper:24002