Discretization: Privacy-preserving data publishing for causal discovery

Youngmin Ahn, Woongjoon Park and Gunwoong Park

Computational Statistics & Data Analysis, 2025, vol. 209, issue C

Abstract: As the importance of data privacy continues to grow, data masking has emerged as a crucial method. Notably, data masking techniques aim to protect individual privacy, while enabling data analysts to derive meaningful statistical results, such as the identification of directional or causal relationships between variables. Hence, this study demonstrates the advantages of a quantile-based discretization for protecting privacy and uncovering the relationships between variables in Gaussian directed acyclic graphical (DAG) models. Specifically, it introduces quantile-discretized Gaussian DAG models where each node variable is discretized based on the quantiles. Additionally, it proposes the bi-partition process, which aids in recovering the covariance matrix; hence, the models can be identifiable. Furthermore, a consistent algorithm is developed for learning the underlying structure using the quantile-based discretized data. Finally, through numerical experiments and the application of DAG learning algorithms to discretized MLB data, the proposed algorithm is demonstrated to significantly outperform the state-of-the-art DAG model learning algorithms.
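The quantile-based discretization mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `quantile_discretize`, the parameter `n_levels`, the two-node simulated model, and the edge weight 0.8 are all assumptions made for illustration, and the bi-partition process and structure-learning algorithm are not shown.

```python
import numpy as np


def quantile_discretize(x, n_levels=4):
    """Replace each value with the index of the empirical quantile bin it falls in.

    Hypothetical helper for illustration only; labels run from 0 to n_levels - 1.
    """
    x = np.asarray(x, dtype=float)
    # Interior cut points at the 1/n_levels, 2/n_levels, ... empirical quantiles.
    probs = np.arange(1, n_levels) / n_levels
    cuts = np.quantile(x, probs)
    # For each value, count how many cut points are less than or equal to it.
    return np.searchsorted(cuts, x, side="right")


# Simulate a two-node Gaussian DAG X1 -> X2 (the weight 0.8 is an arbitrary assumption).
rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
data = np.column_stack([x1, x2])

# Discretize each node variable separately by its own quantiles.
discretized = np.column_stack(
    [quantile_discretize(data[:, j], n_levels=4) for j in range(data.shape[1])]
)

# Only bin labels are published, so the raw continuous values are masked.
print(np.bincount(discretized[:, 0], minlength=4))
```

Because the cut points are the variable's own quantiles, each level holds roughly the same number of observations, and only the bin labels (not the raw values) would be released.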

Keywords: Causal discovery; Discretization; Gaussian graphical models; Identifiability; Privacy; Robust learning
Date: 2025

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947325000507
Full text for ScienceDirect subscribers only.

Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:209:y:2025:i:c:s0167947325000507

DOI: 10.1016/j.csda.2025.108174

Computational Statistics & Data Analysis is currently edited by S.P. Azen

More articles in Computational Statistics & Data Analysis from Elsevier
Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-05-06
Handle: RePEc:eee:csdana:v:209:y:2025:i:c:s0167947325000507