Governing Synthetic Data in the Financial Sector
Taylor C. Spears, Kristian Bondo Hansen, Ruowen Xu and Yuval Millo
Additional contact information
Taylor C. Spears: University of Edinburgh
No ruxkh_v1, SocArXiv from Center for Open Science
Abstract:
Synthetic datasets, artificially generated to mimic real-world data while maintaining anonymization, have emerged as a promising technology in the financial sector, attracting support from regulators and market participants as a solution to data privacy and scarcity challenges limiting machine learning deployment. This paper argues that synthetic data's effects on financial markets depend critically on how these technologies are embedded within existing machine learning infrastructural "stacks" rather than on their intrinsic properties. We identify three key tensions that will determine whether adoption proves beneficial or harmful: (1) data circulability versus opacity, particularly the "double opacity" problem arising from stacked machine learning systems, (2) model-induced scattering versus model-induced herding in market participant behaviour, and (3) flattening versus deepening of data platform power. These tensions directly correspond to core regulatory priorities around model risk management, systemic risk, and competition policy. Using financial audit as a case study, we demonstrate how these tensions interact in practice and propose governance frameworks, including a synthetic data labelling regime to preserve contextual information when datasets cross organizational boundaries.
Date: 2025-09-08
New Economics Papers: this item is included in nep-cmp
Downloads: (external link)
https://osf.io/download/68baa89d28a718439fa6c109/
Persistent link: https://EconPapers.repec.org/RePEc:osf:socarx:ruxkh_v1
DOI: 10.31219/osf.io/ruxkh_v1