A New Alternating Suboptimal Dynamic Programming Algorithm with Applications for Feature Selection
David Podgorelec (),
Borut Žalik,
Domen Mongus and
Dino Vlahek
Additional contact information
David Podgorelec: Faculty of Electrical Engineering and Computer Science, University of Maribor, Koroška Cesta 46, SI-2000 Maribor, Slovenia
Borut Žalik: Faculty of Electrical Engineering and Computer Science, University of Maribor, Koroška Cesta 46, SI-2000 Maribor, Slovenia
Domen Mongus: Faculty of Electrical Engineering and Computer Science, University of Maribor, Koroška Cesta 46, SI-2000 Maribor, Slovenia
Dino Vlahek: Faculty of Electrical Engineering and Computer Science, University of Maribor, Koroška Cesta 46, SI-2000 Maribor, Slovenia
Mathematics, 2024, vol. 12, issue 13, 1-22
Abstract:
Feature selection is predominantly used in machine learning tasks, such as classification, regression, and clustering. It selects a subset of features (relevant attributes of data points) from a larger set that contributes as optimally as possible to the informativeness of the model. There are exponentially many subsets of a given set, and thus, the exhaustive search approach is only practical for problems with at most a few dozen features. In the past, there have been attempts to reduce the search space using dynamic programming. However, models that consider similarity in pairs of features alongside the quality of individual features do not provide the required optimal substructure. As a result, algorithms, which we will call suboptimal dynamic programming algorithms, find a solution that may deviate significantly from the optimal one. In this paper, we propose an iterative dynamic programming algorithm, which invertsthe order of feature processing in each iteration. Such an alternating approach allows for improving the optimization function by using the score from the previous iteration to estimate the contribution of unprocessed features. The iterative process is proven to converge and terminates when the solution does not change in three successive iterations or when the number of iterations reaches the threshold. Results in more than 95% of tests align with those of the exhaustive search approach, being competitive and often superior to the reference greedy approach. Validation was carried out by comparing the scores of output feature subsets and examining the accuracy of different classifiers learned on these features across nine real-world applications, considering different scenarios with various numbers of features and samples. In the context of feature selection, the proposed algorithm can be characterized as a robust filter method that can improve machine learning models regardless of dataset size. However, we expect that the idea of alternating suboptimal optimization will soon be generalized to tasks beyond feature selection.
Keywords: dynamic programming; suboptimal solution; feature selection; machine learning (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2024
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
https://www.mdpi.com/2227-7390/12/13/1987/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/13/1987/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:13:p:1987-:d:1423691
Access Statistics for this article
Mathematics is currently edited by Ms. Emma He
More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().