A fast algorithm for computing distance correlation

Arin Chaudhuri and Wenhao Hu

Computational Statistics & Data Analysis, 2019, vol. 135, issue C, 15-24

Abstract: Classical dependence measures such as Pearson correlation, Spearman’s ρ, and Kendall’s τ can detect only monotonic or linear dependence. To overcome these limitations, Székely et al. proposed distance covariance and its derived correlation. The distance covariance is a weighted L² distance between the joint characteristic function and the product of the marginal characteristic functions; it is 0 if and only if two random vectors X and Y are independent. This measure can detect the presence of a dependence structure when the sample size is large enough. Székely et al. further showed that the sample distance covariance can be calculated simply from modified Euclidean distances, which typically requires O(n²) cost, where n is the sample size. Quadratic computing time greatly limits the use of the distance covariance for large data. To calculate the sample distance covariance between two univariate random variables, a simple, exact O(n log n) algorithm is developed. The proposed algorithm essentially consists of two sorting steps, so it is easy to implement. Empirical results show that the proposed algorithm is significantly faster than state-of-the-art methods. The algorithm’s speed will enable researchers to explore complicated dependence structures in large datasets.
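
To make the cost contrast in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes the original V-statistic form of the squared sample distance covariance from Székely et al. (double-centered pairwise distances), which may differ from the estimator used in the paper. All function names (dcov_sq_naive, dcor_naive, distance_row_sums) are hypothetical. The first two functions are the quadratic baseline; the third shows one sort-based building block, the row sums of the distance matrix, which illustrates why sorting helps in the univariate case. It does not reproduce the full O(n log n) method.

```python
import math


def _double_centered(v):
    """Pairwise |v_i - v_j| matrix with row, column, and grand means removed."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]  # the matrix is symmetric, so these are also column means
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]


def dcov_sq_naive(x, y):
    """Squared sample distance covariance (V-statistic form): O(n^2) time and memory."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    a = _double_centered(x)
    b = _double_centered(y)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2


def dcor_naive(x, y):
    """Sample distance correlation; returns 0.0 when either sample is constant."""
    denom = math.sqrt(dcov_sq_naive(x, x) * dcov_sq_naive(y, y))
    if denom == 0.0:
        return 0.0
    return math.sqrt(max(dcov_sq_naive(x, y), 0.0) / denom)


def distance_row_sums(v):
    """Return sum_j |v_i - v_j| for every i using one sort and a running prefix sum."""
    n = len(v)
    order = sorted(range(n), key=v.__getitem__)
    total = sum(v)
    out = [0.0] * n
    prefix = 0.0  # sum of the values that precede the current one in sorted order
    for k, i in enumerate(order):
        below = k * v[i] - prefix                               # sum of v[i] - v[j] over earlier values
        above = (total - prefix - v[i]) - (n - k - 1) * v[i]    # sum of v[j] - v[i] over later values
        out[i] = below + above
        prefix += v[i]
    return out
```

For example, dcor_naive([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]) is 1 up to rounding, since the second sample is an exact linear function of the first. distance_row_sums gives the same values as summing each row of the full distance matrix, but never builds the n × n matrix.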

Keywords: Distance correlation; Dependency measure; Fast algorithm; Merge sort
Date: 2019

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947319300313
Full text for ScienceDirect subscribers only.

Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:135:y:2019:i:c:p:15-24

DOI: 10.1016/j.csda.2019.01.016

Computational Statistics & Data Analysis is currently edited by S.P. Azen

Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-03-19
Handle: RePEc:eee:csdana:v:135:y:2019:i:c:p:15-24