Learning causal graphs using variable grouping according to ancestral relationship

Cai, Ming; Hara, Hisayuki

Ming Cai and Hisayuki Hara
Additional contact information
Ming Cai: Kyoto University
Hisayuki Hara: Kyoto University

Computational Statistics, 2025, vol. 40, issue 7, No 21, 3947-3979

Abstract: When the sample size is small relative to the number of variables, the accuracy of conventional causal learning algorithms decreases, and some causal discovery methods are not feasible at all when the sample size is smaller than the number of variables. To circumvent these problems, some researchers have proposed causal discovery algorithms based on divide-and-conquer approaches (e.g., Cai et al. in SADA: a general framework to support robust causation discovery. In: International Conference on Machine Learning, PMLR, pp 208–216, 2013; Zhang et al. in IEEE Trans Cybern 52:3232–3243, 2020). To learn an entire causal graph, divide-and-conquer approaches first split the variables into several subsets according to the conditional independence relationships among them, then apply a conventional causal discovery algorithm to each subset and merge the estimated results. Since the divide-and-conquer approach reduces the number of variables to which a causal discovery algorithm is applied, it is expected to improve estimation accuracy, especially when the sample size is small relative to the number of variables and the model is sparse. However, existing methods are computationally expensive or do not provide sufficient accuracy when the sample size is small. This paper proposes a new algorithm for grouping variables according to their causal ancestral relationships, assuming that the causal model is LiNGAM (Shimizu et al. in J Mach Learn Res 7:2003–2030, 2006). We call the proposed algorithm causal ancestral-relationship-based grouping (CAG). The time complexity of the ancestor finding in CAG is shown to be cubic in the number of variables. Extensive computer experiments confirm that the proposed method outperforms the original DirectLiNGAM (Shimizu et al. in J Mach Learn Res-JMLR 12:1225–1248, 2011) and other divide-and-conquer approaches, not only in estimation accuracy but also in computation time, when the sample size is small relative to the number of variables and the causal model is sparse or moderately dense. We also apply the proposed method to two real datasets to confirm its usefulness.
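
The split-then-merge idea described in the abstract can be illustrated with a small Python sketch. It is not the CAG algorithm from the paper: the ancestral relations are assumed to be known from a given edge list (the paper estimates them from data), and the grouping rule used here (each variable together with its ancestors, keeping only the maximal sets) is an illustrative assumption. The function names `transitive_closure`, `group_by_ancestors`, and `merge_edges` are hypothetical.

```python
def transitive_closure(n, edges):
    """Return anc where anc[i][j] is True if variable j is an ancestor of i.

    `edges` is a list of (parent, child) pairs of a known DAG. This is an
    assumption for illustration; the paper estimates ancestors from data.
    """
    anc = [[False] * n for _ in range(n)]
    for parent, child in edges:
        anc[child][parent] = True
    # Warshall-style closure: j is an ancestor of i if j is an ancestor
    # of some k that is itself an ancestor of i.
    for k in range(n):
        for i in range(n):
            if anc[i][k]:
                for j in range(n):
                    if anc[k][j]:
                        anc[i][j] = True
    return anc


def group_by_ancestors(anc):
    """Hypothetical grouping rule: one family per variable (the variable
    plus all of its ancestors), keeping only the maximal families."""
    n = len(anc)
    families = [
        frozenset([i]) | frozenset(j for j in range(n) if anc[i][j])
        for i in range(n)
    ]
    maximal = {f for f in families if not any(f < g for g in families)}
    return sorted(sorted(group) for group in maximal)


def merge_edges(per_group_edges):
    """Merge the edge lists estimated on each group into one edge list."""
    return sorted(set().union(*per_group_edges))


# Toy graph with two unconnected chains: 0 -> 1 -> 2 and 3 -> 4.
anc = transitive_closure(5, [(0, 1), (1, 2), (3, 4)])
groups = group_by_ancestors(anc)
print(groups)
```

In a full pipeline, a conventional causal discovery algorithm such as DirectLiNGAM would then be run on each group separately, and the per-group edge lists would be combined with a step like `merge_edges`.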

Keywords: Causal discovery; Causal DAG; Conditional independence test; DirectLiNGAM; Divide-and-conquer; Linear regression
Date: 2025

Downloads: (external link)
http://link.springer.com/10.1007/s00180-025-01633-2 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:compst:v:40:y:2025:i:7:d:10.1007_s00180-025-01633-2

Ordering information: This journal article can be ordered from
http://www.springer.com/statistics/journal/180/PS2

DOI: 10.1007/s00180-025-01633-2

Computational Statistics is currently edited by Wataru Sakamoto, Ricardo Cao and Jürgen Symanzik
