On the consistency of a random forest algorithm in the presence of missing entries
Irving Gómez-Méndez and Emilien Joly
Journal of Nonparametric Statistics, 2024, vol. 36, issue 2, 400-434
Abstract:
This paper tackles the problem of constructing a nonparametric predictor when the latent variables are given with incomplete information. The convenient predictor for this task is the random forest algorithm in conjunction with the so-called CART criterion. The proposed technique enables a partial imputation of the missing values in the data set in a way that suits both a consistent estimator of the regression function and a partial recovery of the missing values. The imputation is done through iterative assignment of the missing values to the tree's cells, maximising the CART criterion. A proof of the consistency of the random forest estimator is given in the case where each latent variable is missing completely at random (MCAR).
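The idea of assigning points with missing values to a tree's cells so as to maximise the CART criterion can be illustrated with a minimal sketch for a single split on a single variable. This is not the authors' procedure, which the abstract does not spell out. The function names cart_criterion and assign_missing, the greedy one-point-at-a-time order, and the use of NaN to mark missing entries are all assumptions made for illustration.

```python
import numpy as np

def cart_criterion(y, left):
    """Empirical CART regression criterion for one split of a cell:
    total sum of squares minus within-child sums of squares, divided by n.

    y    : responses of the points in the cell
    left : boolean mask, True for points sent to the left child
    """
    y = np.asarray(y, dtype=float)
    left = np.asarray(left, dtype=bool)
    total = np.sum((y - y.mean()) ** 2)
    within = 0.0
    for side in (left, ~left):
        if side.any():  # an empty child contributes nothing
            within += np.sum((y[side] - y[side].mean()) ** 2)
    return (total - within) / len(y)

def assign_missing(x, y, threshold):
    """Hypothetical greedy assignment (an assumption, not the paper's method).

    Observed points go left when x < threshold. Each point whose x is
    missing (NaN) is then placed, one at a time, in the child that yields
    the larger CART criterion given the points placed so far.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(x)
    left = np.zeros(len(x), dtype=bool)
    left[~missing] = x[~missing] < threshold
    placed = ~missing  # points already assigned to a child
    for i in np.flatnonzero(missing):
        placed[i] = True
        scores = []
        for go_left in (True, False):
            left[i] = go_left
            scores.append(cart_criterion(y[placed], left[placed]))
        left[i] = scores[0] >= scores[1]  # ties go to the left child
    return left
```

With x = [0.1, 0.2, 0.8, 0.9, nan, nan], y = [1.0, 1.1, 5.0, 5.2, 0.9, 5.1] and a threshold of 0.5, the sketch sends the first missing point (response 0.9) to the left child and the second (response 5.1) to the right child, since those placements keep the responses within each child homogeneous.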
Date: 2024
Downloads: (external link)
http://hdl.handle.net/10.1080/10485252.2023.2219783 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:taf:gnstxx:v:36:y:2024:i:2:p:400-434
Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/GNST20
DOI: 10.1080/10485252.2023.2219783
Journal of Nonparametric Statistics is currently edited by Jun Shao
Bibliographic data for series maintained by Chris Longhurst.