Two-Stage Mining of Linkage Risk for Data Release

Runshan Hu, Yuanguo Lin, Mu Yang, Yuanhui Yu and Vladimiro Sassone
Additional contact information
Runshan Hu: School of Computer Engineering, Jimei University, Xiamen 361021, China
Yuanguo Lin: School of Computer Engineering, Jimei University, Xiamen 361021, China
Mu Yang: Birkbeck, University of London, London WC1E 7HX, UK
Yuanhui Yu: School of Computer Engineering, Jimei University, Xiamen 361021, China
Vladimiro Sassone: School of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, UK

Mathematics, 2025, vol. 13, issue 17, 1-30

Abstract: Privacy risk mining, a crucial domain in data privacy protection, endeavors to uncover potential information among datasets that could be linked to individuals’ sensitive data. Existing anonymization and privacy assessment techniques either lack quantitative granularity or fail to adapt to dynamic, heterogeneous data environments. In this work, we propose a unified two-phase linkability quantification framework that systematically measures privacy risks at both the inter-dataset and intra-dataset levels. Our approach integrates unsupervised clustering on attribute distributions with record-level matching to compute interpretable, fine-grained risk scores. By aligning risk measurement with regulatory standards such as the GDPR, our framework provides a practical, scalable solution for safeguarding user privacy in evolving data-sharing ecosystems. Extensive experiments on real-world and synthetic datasets show that our method achieves up to 96.7% precision in identifying true linkage risks, outperforming the compared baseline by 13 percentage points under identical experimental settings. Ablation studies further demonstrate that the hierarchical risk fusion strategy improves sensitivity to latent vulnerabilities, providing more actionable insights than previous privacy gain-based metrics.

Keywords: privacy risk mining; linkability quantification; unsupervised clustering; GDPR compliance; heterogeneous data analysis
JEL-codes: C
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2227-7390/13/17/2731/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/17/2731/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:17:p:2731-:d:1732276

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-10-04
Handle: RePEc:gam:jmathe:v:13:y:2025:i:17:p:2731-:d:1732276