SBNNR: Small-Size Bat-Optimized KNN Regression
Rasool Seyghaly,
Jordi Garcia,
Xavi Masip-Bruin and
Jovana Kuljanin
Additional contact information
Rasool Seyghaly: Advanced Network Architectures Laboratory (CRAAX), Universitat Politècnica de Catalunya (UPC) BarcelonaTECH, 08800 Vilanova, Spain
Jordi Garcia: Advanced Network Architectures Laboratory (CRAAX), Universitat Politècnica de Catalunya (UPC) BarcelonaTECH, 08800 Vilanova, Spain
Xavi Masip-Bruin: Advanced Network Architectures Laboratory (CRAAX), Universitat Politècnica de Catalunya (UPC) BarcelonaTECH, 08800 Vilanova, Spain
Jovana Kuljanin: Aeronautical Division, Universitat Politècnica de Catalunya BarcelonaTECH, 08034 Barcelona, Spain
Future Internet, 2024, vol. 16, issue 11, 1-20
Abstract:
Small datasets are common in some scientific fields, usually because laboratory and experimental data are difficult or costly to produce. Researchers nevertheless want to analyze data at this scale with machine learning methods, and the models developed for it often perform poorly and overfit. Methods designed specifically for this type of data are therefore needed. In this research, we propose a new framework for regression problems with a small sample size. The proposed method is based on the K-nearest neighbors (KNN) algorithm. For feature selection, instance selection, and hyperparameter tuning, we use the bat optimization algorithm (BA). Generative Adversarial Networks (GANs) are employed to generate synthetic data, addressing the challenges associated with data sparsity. Deep Neural Networks (DNNs) are used for feature extraction from both synthetic and real datasets. This hybrid framework integrates KNN, DNN, and GAN as foundational components and is optimized in multiple aspects (features, instances, and hyperparameters) using BA. The results show an improvement of up to 5% in the coefficient of determination (R² score) for the proposed method compared to the standard KNN method optimized through grid search.
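The core idea of the abstract, a bat-style search over feature subsets and the KNN hyperparameter k, scored by R², can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the function names (`make_toy_data`, `loo_r2`, `decode`, `bat_search`), the leave-one-out scoring, the fixed loudness and pulse rate, and the simplified update rule are all assumptions made for illustration. Instance selection and the GAN and DNN components are omitted.

```python
import numpy as np

def make_toy_data(seed=0):
    """Hypothetical small dataset: 30 samples, 6 features, only 2 informative."""
    rng = np.random.default_rng(seed)
    X = rng.random((30, 6))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=30)
    return X, y

def loo_r2(X, y, mask, k):
    """Leave-one-out R^2 of KNN regression on the selected feature columns."""
    if not mask.any():
        return -np.inf  # an empty feature subset is never acceptable
    Xs, n = X[:, mask], len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        dist = np.linalg.norm(Xs[keep] - Xs[i], axis=1)
        nearest = np.argsort(dist)[:k]
        preds[i] = y[keep][nearest].mean()
    ss_res = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def decode(pos, k_max):
    """Map a position in [0, 1]^d to (boolean feature mask, integer k)."""
    mask = pos[:-1] > 0.5
    k = int(np.clip(np.rint(pos[-1] * k_max), 1, k_max))
    return mask, k

def bat_search(X, y, n_bats=12, n_iter=30, k_max=10, seed=0):
    """Simplified bat-style search (assumed variant, fixed loudness/pulse rate)."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] + 1
    pos = rng.random((n_bats, dim))
    pos[0, :-1], pos[0, -1] = 1.0, 3 / k_max  # seed one bat: all features, k=3
    vel = np.zeros((n_bats, dim))
    fit = np.array([loo_r2(X, y, *decode(p, k_max)) for p in pos])
    best, best_fit = pos[fit.argmax()].copy(), fit.max()
    loudness, pulse_rate, f_max = 0.9, 0.5, 2.0
    for _ in range(n_iter):
        for i in range(n_bats):
            freq = f_max * rng.random()
            vel[i] += (best - pos[i]) * freq        # pull toward the best bat
            cand = pos[i] + vel[i]
            if rng.random() > pulse_rate:           # local walk around the best
                cand = best + 0.05 * loudness * rng.normal(size=dim)
            cand = np.clip(cand, 0.0, 1.0)
            cand_fit = loo_r2(X, y, *decode(cand, k_max))
            if cand_fit > fit[i] and rng.random() < loudness:
                pos[i], fit[i] = cand, cand_fit
            if cand_fit > best_fit:                 # best-so-far never degrades
                best, best_fit = cand.copy(), cand_fit
    mask, k = decode(best, k_max)
    return mask, k, best_fit

X, y = make_toy_data()
mask, k, score = bat_search(X, y)
baseline = loo_r2(X, y, np.ones(X.shape[1], dtype=bool), 3)
print("selected features:", np.flatnonzero(mask), "k:", k)
print("R^2 (search):", score, "R^2 (all features, k=3):", baseline)
```

Because one bat is seeded with the all-features, k=3 configuration and the best-so-far solution is only ever replaced by a better one, the search result in this sketch cannot score below that baseline on the same leave-one-out criterion. The size of any gain on real data is not implied by this toy example.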
Keywords: regression; K-nearest neighbor; bat algorithm; instance selection; feature selection
JEL-codes: O3
Date: 2024
Downloads:
https://www.mdpi.com/1999-5903/16/11/422/pdf (application/pdf)
https://www.mdpi.com/1999-5903/16/11/422/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:16:y:2024:i:11:p:422-:d:1520775
Future Internet is currently edited by Ms. Grace You
Bibliographic data for series maintained by MDPI Indexing Manager.