Boosting interaction tree stumps for modeling interactions
Michael Lau, Tamara Schikowski and Holger Schwender
Computational Statistics & Data Analysis, 2026, vol. 213, issue C
Abstract:
Incorporating interaction effects is essential for accurately modeling complex relationships in many applications. Often, both strong predictive performance and interpretability of the resulting model are desired. This need is evident in areas such as epidemiology, in which uncovering the interplay of biological mechanisms is critical for understanding complex diseases. Classical linear models, frequently used for constructing genetic risk scores, cannot capture interaction effects autonomously, while modern machine learning methods such as gradient boosting often produce black-box models that lack interpretability. Existing linear interaction models are largely limited to two-way interactions. To address these limitations, a novel statistical learning method, BITS (Boosting Interaction Tree Stumps), is introduced that constructs linear models while autonomously detecting and incorporating interaction effects. BITS applies gradient boosting to interaction tree stumps, i.e., decision trees with a single split, where in BITS this split may occur on an interaction term. A branch-and-bound approach is employed to discard weakly predictive terms. For high-dimensional data, a hybrid search strategy combining greedy and exhaustive approaches is proposed. Regularization techniques are integrated to prevent overfitting and the inclusion of spurious interaction effects. Simulation studies and real data applications demonstrate that BITS produces interpretable models with strong predictive performance. Moreover, in the simulation study, BITS primarily identifies truly influential terms.
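As an illustration of the idea described in the abstract, the following is a minimal sketch of boosting stumps that may split on an interaction term. It is not the authors' implementation. It assumes binary (0/1) predictors, squared-error loss, and interaction terms defined as products of at most `max_order` predictors. It searches all candidate terms exhaustively in place of the branch-and-bound and hybrid search strategies, and uses only a learning rate as regularization. The names `fit_interaction_stumps` and `predict` are hypothetical.

```python
from itertools import combinations


def fit_interaction_stumps(X, y, n_rounds=50, learning_rate=0.1, max_order=2):
    """Toy L2 boosting over stumps that split on one binary predictor or on
    a product (logical AND) of up to `max_order` binary predictors."""
    n, p = len(X), len(X[0])
    # Candidate terms are index tuples such as (0,) or (0, 3).
    terms = [c for k in range(1, max_order + 1)
             for c in combinations(range(p), k)]
    cols = []
    for t in terms:
        z = [float(all(row[i] for i in t)) for row in X]   # 0/1 term value
        m = sum(z) / n
        zc = [v - m for v in z]                            # centred term
        cols.append((m, zc, sum(v * v for v in zc)))

    intercept = sum(y) / n
    coefs = {}
    pred = [intercept] * n
    for _ in range(n_rounds):
        # Residuals are the negative gradient of the squared-error loss.
        resid = [yi - pi for yi, pi in zip(y, pred)]
        best_gain, best_j, best_slope = 1e-12, None, 0.0
        for j, (_, zc, ss) in enumerate(cols):
            if ss < 1e-12:                                 # constant term
                continue
            slope = sum(a * b for a, b in zip(zc, resid)) / ss
            gain = slope * slope * ss                      # drop in residual SS
            if gain > best_gain:
                best_gain, best_j, best_slope = gain, j, slope
        if best_j is None:
            break
        step = learning_rate * best_slope
        m, zc, _ = cols[best_j]
        # A stump on a 0/1 term equals an intercept shift plus a slope,
        # so the ensemble collapses into a linear model.
        intercept -= step * m
        coefs[terms[best_j]] = coefs.get(terms[best_j], 0.0) + step
        pred = [pi + step * v for pi, v in zip(pred, zc)]
    return intercept, coefs


def predict(intercept, coefs, X):
    """Evaluate the fitted linear interaction model on rows of 0/1 values."""
    return [intercept + sum(b for t, b in coefs.items()
                            if all(row[i] for i in t))
            for row in X]
```

Because every stump splits on a 0/1 term, the boosted ensemble reduces to an intercept plus one coefficient per selected term, which is what keeps such a model readable as a linear model with interaction effects.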
Keywords: Linear interaction models; Penalized regression; Machine learning; Polygenic risk scores
Date: 2026
Downloads:
http://www.sciencedirect.com/science/article/pii/S0167947325001239
Full text for ScienceDirect subscribers only.
Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:213:y:2026:i:c:s0167947325001239
DOI: 10.1016/j.csda.2025.108247
Computational Statistics & Data Analysis is currently edited by S.P. Azen
Bibliographic data for series maintained by Catherine Liu.