Fast multivariate empirical cumulative distribution function with connection to kernel density estimation

Nicolas Langrené and Xavier Warin

Computational Statistics & Data Analysis, 2021, vol. 162, issue C

Abstract: The problem of computing empirical cumulative distribution functions (ECDFs) efficiently on large multivariate datasets is revisited. Computing an ECDF at one evaluation point requires $O(N)$ operations on a dataset composed of $N$ data points. Therefore, a direct evaluation of ECDFs at $N$ evaluation points requires a quadratic number of operations, $O(N^2)$, which is prohibitive for large-scale problems. Two fast and exact methods are proposed and compared. The first one is based on fast summation in lexicographical order, has an $O(N \log N)$ complexity, and requires the evaluation points to lie on a regular grid. The second one is based on the divide-and-conquer principle, has an $O(N \log(N)^{(d-1) \vee 1})$ complexity, and requires the evaluation points to coincide with the input points. The two fast algorithms are described and detailed in the general $d$-dimensional case, and numerical experiments validate their speed and accuracy. In addition, a direct connection between cumulative distribution functions and kernel density estimation (KDE) is established for a large class of kernels. This connection paves the way for fast exact algorithms for multivariate kernel density estimation and kernel regression. Numerical tests with the Laplacian kernel validate the speed and accuracy of the proposed algorithms. A broad range of large-scale multivariate density estimation, cumulative distribution estimation, survival function estimation and regression problems can benefit from the proposed numerical methods.
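
To make the cost gap described in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It contrasts the direct evaluation (one pass over all data points per evaluation point) with a simplified two-dimensional grid variant that bins each data point once and then accumulates counts with running sums. The function names `ecdf_naive` and `ecdf_on_grid_2d`, the restriction to two dimensions, and the binning approach are all assumptions made for illustration; the paper's lexicographical fast summation and divide-and-conquer methods are more general.

```python
from bisect import bisect_left


def ecdf_naive(data, points):
    """Direct ECDF: for each evaluation point, the fraction of data points
    that are <= it in every coordinate. Cost grows as len(data) * len(points)."""
    n = len(data)
    return [
        sum(all(x <= z for x, z in zip(row, pt)) for row in data) / n
        for pt in points
    ]


def ecdf_on_grid_2d(data, xs, ys):
    """ECDF at every node (xs[i], ys[j]) of a rectangular grid.

    Assumes xs and ys are sorted in increasing order. Each data point is
    binned once, then counts are accumulated along both axes.
    """
    n = len(data)
    counts = [[0] * len(ys) for _ in xs]
    for x, y in data:
        i = bisect_left(xs, x)  # smallest i with xs[i] >= x
        j = bisect_left(ys, y)  # smallest j with ys[j] >= y
        # Points beyond the last node in either coordinate are never counted.
        if i < len(xs) and j < len(ys):
            counts[i][j] += 1
    # Running sum along the y axis, then along the x axis.
    for i in range(len(xs)):
        for j in range(1, len(ys)):
            counts[i][j] += counts[i][j - 1]
    for i in range(1, len(xs)):
        for j in range(len(ys)):
            counts[i][j] += counts[i - 1][j]
    return [[c / n for c in row] for row in counts]
```

Calling `ecdf_on_grid_2d(data, xs, ys)` returns a nested list whose entry `[i][j]` should match `ecdf_naive(data, [(xs[i], ys[j])])[0]`. The grid version touches each data point once and each grid node a constant number of times, which is the kind of saving the grid-based method in the abstract exploits.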

Keywords: Fast CDF; Fast KDE; Empirical distribution function; Survival function; Nonparametric copula estimation; Fast kernel summation
Date: 2021
Citations: 2 (in EconPapers)

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947321001018
Full text for ScienceDirect subscribers only.

Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:162:y:2021:i:c:s0167947321001018

DOI: 10.1016/j.csda.2021.107267

Computational Statistics & Data Analysis is currently edited by S.P. Azen

More articles in Computational Statistics & Data Analysis from Elsevier
Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-03-19
Handle: RePEc:eee:csdana:v:162:y:2021:i:c:s0167947321001018