To Bag is to Prune
Goulet Coulombe Philippe
Additional contact information
Goulet Coulombe Philippe: Université du Québec à Montréal, Montréal, Canada
Studies in Nonlinear Dynamics & Econometrics, 2025, vol. 29, issue 6, 669-697
Abstract:
It is notoriously difficult to build a bad Random Forest (RF). Concurrently, RF blatantly overfits in-sample without apparent consequences out-of-sample. Arguments like the bias-variance trade-off or double descent cannot rationalize this paradox. I propose a new explanation: bootstrap aggregation and model perturbation as implemented by RF automatically prune a latent “true” tree. More generally, I document that randomized ensembles of greedily optimized learners implicitly perform optimal early stopping out-of-sample. So, letting RF overfit the training data is a dominant tuning strategy against nature’s undisclosed choice of noise level. Additionally, novel ensembles of Boosting and MARS are also eligible. I empirically demonstrate the property, with simulated and real data, by reporting that these new completely overfitting ensembles perform similarly to their tuned counterparts – or better.
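The bagging part of this claim can be illustrated with a minimal sketch. It is not the paper's code or experiments: the one-dimensional data-generating process, the sample sizes, the number of trees, and the helper names (grow, predict, bag, mse, simulate) are all assumptions made for illustration, and with a single feature there is no random feature selection, so this is plain bagging rather than a full RF. Each tree is grown greedily until every leaf is pure, so a single tree fits the training sample exactly; the sketch then compares its out-of-sample error with that of an average of such trees fitted on bootstrap samples.

```python
import math
import random

def grow(xs, ys):
    """Greedily grow a 1-D regression tree until leaves are pure; xs must be sorted."""
    n = len(xs)
    if xs[0] == xs[-1]:                    # nothing left to split: return a leaf
        return sum(ys) / n
    total, left = sum(ys), 0.0
    best_score, best_i = -math.inf, None
    for i in range(1, n):                  # candidate split: first i points go left
        left += ys[i - 1]
        if xs[i - 1] == xs[i]:
            continue                       # cannot split between equal x values
        # Maximising this score is equivalent to minimising the two-leaf SSE.
        score = left * left / i + (total - left) ** 2 / (n - i)
        if score > best_score:
            best_score, best_i = score, i
    cut = (xs[best_i - 1] + xs[best_i]) / 2
    return (cut, grow(xs[:best_i], ys[:best_i]), grow(xs[best_i:], ys[best_i:]))

def predict(tree, x):
    """Walk down the nested (cut, left, right) tuples until a leaf value is reached."""
    while isinstance(tree, tuple):
        cut, lo, hi = tree
        tree = lo if x <= cut else hi
    return tree

def bag(xs, ys, n_trees, rng):
    """Fit one fully grown tree per bootstrap sample (xs assumed sorted)."""
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = sorted(rng.randrange(n) for _ in range(n))  # sorted indices keep x sorted
        forest.append(grow([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def mse(predict_fn, xs, ys):
    return sum((y - predict_fn(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def simulate(n, rng):
    """Hypothetical data: y = sin(2x) + standard normal noise."""
    xs = sorted(rng.uniform(-3, 3) for _ in range(n))
    ys = [math.sin(2 * x) + rng.gauss(0, 1) for x in xs]
    return xs, ys

rng = random.Random(0)
x_tr, y_tr = simulate(200, rng)
x_te, y_te = simulate(500, rng)

tree = grow(x_tr, y_tr)
forest = bag(x_tr, y_tr, 100, rng)
tree_fn = lambda x: predict(tree, x)
bag_fn = lambda x: sum(predict(t, x) for t in forest) / len(forest)

single_in, single_out = mse(tree_fn, x_tr, y_tr), mse(tree_fn, x_te, y_te)
bag_in, bag_out = mse(bag_fn, x_tr, y_tr), mse(bag_fn, x_te, y_te)
print(f"single tree : in-sample MSE {single_in:.3f}, out-of-sample MSE {single_out:.3f}")
print(f"bagged trees: in-sample MSE {bag_in:.3f}, out-of-sample MSE {bag_out:.3f}")
```

In this toy setting the single tree interpolates the training data, while the bagged average still fits the training sample far more closely than it fits new data yet predicts better out-of-sample than the single tree. The sketch does not include a tuned or pruned benchmark, so it speaks only to the direction of the effect described in the abstract, not to its size.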
Keywords: random forest; overfitting; tuning; greedy optimization
JEL-codes: C45 C52 C53 C63
Date: 2025
Downloads: (external link)
https://doi.org/10.1515/snde-2023-0030 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.
Persistent link: https://EconPapers.repec.org/RePEc:bpj:sndecm:v:29:y:2025:i:6:p:669-697:n:1002
Ordering information: This journal article can be ordered from
https://www.degruyte ... ournal/key/snde/html
DOI: 10.1515/snde-2023-0030
Studies in Nonlinear Dynamics & Econometrics is currently edited by Bruce Mizrach
More articles in Studies in Nonlinear Dynamics & Econometrics from De Gruyter
Bibliographic data for series maintained by Peter Golla.