On the properties of distance covariance for categorical data: Robustness, sure screening, and approximate null distributions
Qingyang Zhang
Scandinavian Journal of Statistics, 2025, vol. 52, issue 2, 777-804
Abstract:
Pearson's Chi‐squared test, though widely used for detecting association between categorical variables, exhibits low statistical power in large sparse contingency tables. To address this limitation, two novel permutation tests have recently been developed: the distance covariance permutation test and the U‐statistic permutation test. Both leverage the distance covariance functional but employ different estimators. In this work, we explore key statistical properties of the distance covariance for categorical variables. First, we show that, unlike Chi‐squared, the distance covariance functional is B‐robust for any number of categories (fixed or diverging). Second, we establish the strong consistency of distance covariance screening under mild conditions, and simulations confirm its advantage over Chi‐squared screening, especially for large sparse tables. We illustrate this novel screening method using the General Social Survey data. Finally, we derive an approximate null distribution for a bias‐corrected distance correlation estimate, demonstrating its effectiveness through simulations and real‐world applications.
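The idea of a distance covariance permutation test for categorical data can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: the discrete metric (distance 1 between different categories, 0 otherwise), under which the squared distance covariance reduces to the sum over cells of (p_ij - p_i. p_.j)^2, and a simple plug-in estimate of that sum. The function names `dcov_sq_categorical` and `permutation_pvalue` are hypothetical, and the paper's own estimators (including the U‐statistic and bias‐corrected versions) may differ.

```python
import numpy as np


def dcov_sq_categorical(table):
    """Plug-in squared distance covariance for a two-way table of counts.

    Assumes the discrete metric, so the functional is the sum of squared
    differences between joint cell proportions and products of marginals.
    """
    counts = np.asarray(table, dtype=float)
    p = counts / counts.sum()              # joint cell proportions
    row = p.sum(axis=1, keepdims=True)     # row marginals
    col = p.sum(axis=0, keepdims=True)     # column marginals
    return float(((p - row * col) ** 2).sum())


def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value for independence of two categorical samples."""
    rng = np.random.default_rng(seed)
    # Recode arbitrary labels as integer codes 0..K-1.
    _, xi = np.unique(np.asarray(x), return_inverse=True)
    _, yi = np.unique(np.asarray(y), return_inverse=True)
    shape = (xi.max() + 1, yi.max() + 1)

    def stat(y_codes):
        # Cross-tabulate the paired codes, then evaluate the statistic.
        table = np.zeros(shape)
        np.add.at(table, (xi, y_codes), 1)
        return dcov_sq_categorical(table)

    observed = stat(yi)
    # Shuffling y breaks any association with x while keeping both marginals.
    exceed = sum(stat(rng.permutation(yi)) >= observed for _ in range(n_perm))
    return observed, (1 + exceed) / (1 + n_perm)
```

For example, `permutation_pvalue(["a"] * 50 + ["b"] * 50, ["u"] * 50 + ["v"] * 50)` returns the observed statistic together with a p-value computed from 999 random relabelings of the second variable.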
Date: 2025
Downloads: https://doi.org/10.1111/sjos.12771
Persistent link: https://EconPapers.repec.org/RePEc:bla:scjsta:v:52:y:2025:i:2:p:777-804
Ordering information: This journal article can be ordered from
http://www.blackwell ... bs.asp?ref=0303-6898
Scandinavian Journal of Statistics is currently edited by Ørnulf Borgan and Bo Lindqvist
More articles in Scandinavian Journal of Statistics from Danish Society for Theoretical Statistics, Finnish Statistical Society, Norwegian Statistical Association, Swedish Statistical Association
Bibliographic data for series maintained by Wiley Content Delivery.