Asymptotic Normality for Multivariate Random Forest Estimators
Kevin Li
Papers from arXiv.org
Abstract:
Regression trees and random forests are popular and effective non-parametric estimators in practical applications. A recent paper by Athey and Wager shows that the random forest estimate at any point is asymptotically Gaussian; in this paper, we extend this result to the multivariate case and show that the vector of estimates at multiple points is jointly asymptotically normal. Specifically, the covariance matrix of the limiting normal distribution is diagonal, so that the estimates at any two points are independent in sufficiently deep trees. Moreover, the off-diagonal term is bounded by quantities capturing how likely two points are to belong to the same partition of the resulting tree. Our results rely on a certain stability property when constructing splits, and we give examples of splitting rules for which this assumption is and is not satisfied. We test our proposed covariance bound and the associated coverage rates of confidence intervals in numerical simulations.
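The abstract ties the off-diagonal covariance to how likely two points are to fall in the same cell of a tree. As a rough illustration of that quantity only, here is a minimal sketch under a hypothetical toy model that is not taken from the paper: a one-dimensional tree on [0, 1) that splits the current cell at a uniformly random location. The function name `same_leaf_frequency` and all its parameters are assumptions made for this example.

```python
import random


def same_leaf_frequency(x1, x2, max_depth, n_trees=2000, seed=0):
    """Fraction of random trees in which x1 and x2 share a cell, per depth.

    Hypothetical toy model (not the paper's splitting rule): each tree
    repeatedly splits the cell of [0, 1) containing x1 at a uniformly
    random location and keeps the side that contains x1.
    Both x1 and x2 are assumed to lie in [0, 1).
    """
    rng = random.Random(seed)
    counts = [0] * (max_depth + 1)
    for _ in range(n_trees):
        lo, hi = 0.0, 1.0
        together = True
        counts[0] += 1  # depth 0: the root cell contains both points
        for depth in range(1, max_depth + 1):
            if together:
                cut = rng.uniform(lo, hi)
                # Keep the half-open side [lo, cut) or [cut, hi) holding x1.
                if x1 < cut:
                    hi = cut
                else:
                    lo = cut
                together = lo <= x2 < hi
            # Once separated, the two points stay separated at deeper levels.
            counts[depth] += together
    return [c / n_trees for c in counts]


if __name__ == "__main__":
    for depth, freq in enumerate(same_leaf_frequency(0.3, 0.7, max_depth=8)):
        print(depth, freq)
```

In this toy model the same-cell frequency can only shrink as the tree gets deeper, which matches the intuition in the abstract that estimates at distinct points decouple in sufficiently deep trees. It says nothing about the actual bound or the stability condition studied in the paper.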
Date: 2020-12, Revised 2021-01
New Economics Papers: this item is included in nep-ecm
Downloads: (external link)
http://arxiv.org/pdf/2012.03486 Latest version (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2012.03486