SUBiNN: a stacked uni- and bivariate kNN sparse ensemble
Tiffany Elsten and
Mark Rooij ()
Additional contact information
Tiffany Elsten: Leiden University
Mark Rooij: Leiden University
Advances in Data Analysis and Classification, 2022, vol. 16, issue 4, No 3, 847-874
Abstract:
Abstract Nearest Neighbor classification is an intuitive distance-based classification method. It has, however, two drawbacks: (1) it is sensitive to the number of features, and (2) it does not give information about the importance of single features or pairs of features. In stacking, a set of base-learners is combined in one overall ensemble classifier by means of a meta-learner. In this manuscript we combine univariate and bivariate nearest neighbor classifiers that are by itself easily interpretable. Furthermore, we combine these classifiers by a Lasso method that results in a sparse ensemble of nonlinear main and pairwise interaction effects. We christened the new method SUBiNN: Stacked Uni- and Bivariate Nearest Neighbors. SUBiNN overcomes the two drawbacks of simple nearest neighbor methods. In extensive simulations and using benchmark data sets, we evaluate the predictive performance of SUBiNN and compare it to other nearest neighbor ensemble methods as well as Random Forests and Support Vector Machines. Results indicate that SUBiNN often outperforms other nearest neighbor methods, that SUBiNN is well capable of identifying noise features, but that Random Forests is often, but not always, the best classifier.
Keywords: Classification; Distance; Ensemble methods; Lasso regression; Nearest neighbors; 62H30 (search for similar items in EconPapers)
Date: 2022
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
http://link.springer.com/10.1007/s11634-021-00462-7 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:spr:advdac:v:16:y:2022:i:4:d:10.1007_s11634-021-00462-7
Ordering information: This journal article can be ordered from
http://www.springer. ... ds/journal/11634/PS2
DOI: 10.1007/s11634-021-00462-7
Access Statistics for this article
Advances in Data Analysis and Classification is currently edited by H.-H. Bock, W. Gaul, A. Okada, M. Vichi and C. Weihs
More articles in Advances in Data Analysis and Classification from Springer, German Classification Society - Gesellschaft für Klassifikation (GfKl), Japanese Classification Society (JCS), Classification and Data Analysis Group of the Italian Statistical Society (CLADAG), International Federation of Classification Societies (IFCS)
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().