Privacy-Aware Table Data Generation by Adversarial Gradient Boosting Decision Tree

Jiang, Shuai; Iwata, Naoto; Kamei, Sayaka; Alam, Kazi Md. Rokibul; Morimoto, Yasuhiko

Shuai Jiang, Naoto Iwata, Sayaka Kamei, Kazi Md. Rokibul Alam and Yasuhiko Morimoto
Additional contact information
Shuai Jiang: Graduate School of Advanced Science and Engineering, Hiroshima University, Kagamiyama 1-7-1, Higashi-Hiroshima 739-8521, Japan
Naoto Iwata: Graduate School of Advanced Science and Engineering, Hiroshima University, Kagamiyama 1-7-1, Higashi-Hiroshima 739-8521, Japan
Sayaka Kamei: Graduate School of Advanced Science and Engineering, Hiroshima University, Kagamiyama 1-7-1, Higashi-Hiroshima 739-8521, Japan
Kazi Md. Rokibul Alam: Department of Computer Science and Engineering, Khulna University of Engineering and Technology, Khulna 9203, Bangladesh
Yasuhiko Morimoto: Graduate School of Advanced Science and Engineering, Hiroshima University, Kagamiyama 1-7-1, Higashi-Hiroshima 739-8521, Japan

Mathematics, 2025, vol. 13, issue 15, 1-17

Abstract: Privacy preservation poses significant challenges in third-party data sharing, particularly when handling table data containing personal information such as demographic and behavioral records. Synthetic table data generation has emerged as a promising solution to enable data analysis while mitigating privacy risks. While Generative Adversarial Networks (GANs) are widely used for this purpose, they exhibit limitations in modeling table data due to challenges in handling mixed data types (numerical/categorical), non-Gaussian distributions, and imbalanced variables. To address these limitations, this study proposes a novel adversarial learning framework integrating gradient boosting trees for synthesizing table data, called Adversarial Gradient Boosting Decision Tree (AGBDT). Experimental evaluations on several datasets demonstrate that our method outperforms representative baseline models regarding statistical similarity and machine learning utility. Furthermore, we introduce a privacy-aware adaptation of the framework by incorporating k-anonymization constraints, effectively reducing overfitting to source data while maintaining practical usability. The results validate the balance between data utility and privacy preservation achieved by our approach.
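
For readers unfamiliar with the k-anonymization constraint mentioned in the abstract, the following is a minimal sketch of the standard k-anonymity property: every combination of quasi-identifier values must be shared by at least k records. It is not the authors' implementation and says nothing about how AGBDT applies the constraint; the function name `is_k_anonymous`, the column names, and the sample rows are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs in at least k rows.

    rows: list of dicts, one per record (hypothetical table layout).
    quasi_identifiers: column names treated as quasi-identifiers (an assumption).
    k: minimum group size required by k-anonymity.
    """
    # Count how many records share each combination of quasi-identifier values.
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    # The table is k-anonymous when no group is smaller than k.
    return all(count >= k for count in groups.values())


# Illustrative records only; not taken from the paper's datasets.
rows = [
    {"age_band": "20-29", "zip3": "739", "purchase": "book"},
    {"age_band": "20-29", "zip3": "739", "purchase": "game"},
    {"age_band": "30-39", "zip3": "739", "purchase": "book"},
]
```

In this sketch, `is_k_anonymous(rows, ["age_band", "zip3"], 2)` is false because the single record in the `("30-39", "739")` group is unique on its quasi-identifiers, which is the kind of record a privacy-aware generator would need to avoid reproducing.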

Keywords: adversarial learning; decision trees; tree ensembles; privacy evaluation
JEL-codes: C
Date: 2025

Downloads:
https://www.mdpi.com/2227-7390/13/15/2509/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/15/2509/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:15:p:2509-:d:1717406

Mathematics is currently edited by Ms. Emma He
