Privacy-Aware Table Data Generation by Adversarial Gradient Boosting Decision Tree
Shuai Jiang (),
Naoto Iwata,
Sayaka Kamei,
Kazi Md. Rokibul Alam and
Yasuhiko Morimoto
Additional contact information
Shuai Jiang: Graduate School of Advanced Science and Engineering, Hiroshima University, Kagamiyama 1-7-1, Higashi-Hiroshima 739-8521, Japan
Naoto Iwata: Graduate School of Advanced Science and Engineering, Hiroshima University, Kagamiyama 1-7-1, Higashi-Hiroshima 739-8521, Japan
Sayaka Kamei: Graduate School of Advanced Science and Engineering, Hiroshima University, Kagamiyama 1-7-1, Higashi-Hiroshima 739-8521, Japan
Kazi Md. Rokibul Alam: Department of Computer Science and Engineering, Khulna University of Engineering and Technology, Khulna 9203, Bangladesh
Yasuhiko Morimoto: Graduate School of Advanced Science and Engineering, Hiroshima University, Kagamiyama 1-7-1, Higashi-Hiroshima 739-8521, Japan
Mathematics, 2025, vol. 13, issue 15, 1-17
Abstract:
Privacy preservation poses significant challenges in third-party data sharing, particularly when handling table data containing personal information such as demographic and behavioral records. Synthetic table data generation has emerged as a promising solution to enable data analysis while mitigating privacy risks. While Generative Adversarial Networks (GANs) are widely used for this purpose, they exhibit limitations in modeling table data due to challenges in handling mixed data types (numerical/categorical), non-Gaussian distributions, and imbalanced variables. To address these limitations, this study proposes a novel adversarial learning framework integrating gradient boosting trees for synthesizing table data, called Adversarial Gradient Boosting Decision Tree (AGBDT). Experimental evaluations on several datasets demonstrate that our method outperforms representative baseline models regarding statistical similarity and machine learning utility. Furthermore, we introduce a privacy-aware adaptation of the framework by incorporating k -anonymization constraints, effectively reducing overfitting to source data while maintaining practical usability. The results validate the balance between data utility and privacy preservation achieved by our approach.
Keywords: adversarial learning; decision trees; tree ensembles; privacy evaluation (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2025
References: Add references at CitEc
Citations:
Downloads: (external link)
https://www.mdpi.com/2227-7390/13/15/2509/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/15/2509/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:15:p:2509-:d:1717406
Access Statistics for this article
Mathematics is currently edited by Ms. Emma He
More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().