Loss-optimal classification trees: a generalized framework and the logistic case
Tommaso Aldinucci and
Matteo Lapucci
Additional contact information
Tommaso Aldinucci: University of Florence
Matteo Lapucci: University of Florence
TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, 2024, vol. 32, issue 2, No 7, 323-350
Abstract:
Classification trees are one of the most common models in interpretable machine learning. Although such models are usually built with greedy strategies, in recent years, thanks to remarkable advances in mixed-integer programming (MIP) solvers, several exact formulations of the learning problem have been developed. In this paper, we argue that some of the most relevant ones among these training models can be encapsulated within a general framework, whose instances are shaped by the specification of loss functions and regularizers. Next, we introduce a novel realization of this framework: specifically, we consider the logistic loss, handled in the MIP setting by a piece-wise linear approximation, and couple it with $\ell_1$-regularization terms. The resulting optimal logistic classification tree model numerically proves to be able to induce trees with enhanced interpretability properties and competitive generalization capabilities, compared to the state-of-the-art MIP-based approaches.
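The abstract does not say how the piece-wise linear approximation of the logistic loss is constructed. As a rough illustration only, the sketch below uses one common choice for a convex loss: a lower approximation formed by the maximum of tangent lines at a few points. The breakpoints and the function names (`tangent_cuts`, `pwl_loss`) are assumptions made for this example and are not taken from the paper.

```python
import math


def logistic_loss(z):
    """Numerically stable log(1 + exp(-z)), where z is the signed margin."""
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


def tangent_cuts(points):
    """Return (slope, intercept) of the tangent to the logistic loss at each point."""
    cuts = []
    for z0 in points:
        slope = -1.0 / (1.0 + math.exp(z0))  # derivative of log(1 + exp(-z)) at z0
        intercept = logistic_loss(z0) - slope * z0
        cuts.append((slope, intercept))
    return cuts


def pwl_loss(z, cuts):
    """Piece-wise linear lower approximation: pointwise maximum of the tangents."""
    return max(a * z + b for a, b in cuts)


# Hypothetical breakpoints, chosen only for illustration.
breakpoints = [-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0]
cuts = tangent_cuts(breakpoints)

for z in (-3.0, 0.0, 0.5, 3.0):
    print(z, logistic_loss(z), pwl_loss(z, cuts))
```

Because the logistic loss is convex, each tangent lies below it, so in a linear or mixed-integer model the approximation can be written with an auxiliary variable t and one linear constraint t >= slope * z + intercept per tangent, with t minimized in the objective. Whether the paper uses this construction or a different one is not stated in the abstract.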
Keywords: Optimal classification trees; Logistic regression; Interpretability; Mixed-integer programming; 90C11; 90C26; 62-08
Date: 2024
Downloads: (external link)
http://link.springer.com/10.1007/s11750-024-00674-y Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:topjnl:v:32:y:2024:i:2:d:10.1007_s11750-024-00674-y
Ordering information: This journal article can be ordered from
http://link.springer.de/orders.htm
DOI: 10.1007/s11750-024-00674-y
TOP: An Official Journal of the Spanish Society of Statistics and Operations Research is currently edited by Juan José Salazar González and Gustavo Bergantiños
More articles in TOP: An Official Journal of the Spanish Society of Statistics and Operations Research from Springer, Sociedad de Estadística e Investigación Operativa
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.