Active learning of constraints for weighted feature selection
Samah Hijazi,
Denis Hamad,
Mariam Kalakech and
Ali Kalakech
Additional contact information
Samah Hijazi: Université du Littoral Côte d’Opale
Denis Hamad: Université du Littoral Côte d’Opale
Advances in Data Analysis and Classification, 2021, vol. 15, issue 2, No 5, 337-377
Abstract:
Pairwise constraints, a cheaper kind of supervision that does not require revealing the class labels of data points, were initially suggested to enhance the performance of clustering algorithms. More recently, researchers have become interested in using them for feature selection. However, in most current methods, pairwise constraints are provided passively and generated randomly over multiple algorithmic runs, whose results are then averaged. This creates a need for a large number of constraints that might be redundant, unnecessary, and under some circumstances even detrimental to the algorithm’s performance. It also masks the individual effect of each constraint set and introduces a human labor cost. Therefore, in this paper, we suggest a framework for actively selecting and then propagating constraints for feature selection. To do so, we exploit the graph Laplacian defined on the similarity matrix. We assume that when a small perturbation of the similarity value between a pair of data points leads to a better-separated cluster indicator, based on the second eigenvector of the graph Laplacian, this pair is expected to be a pairwise query of higher and more significant impact. Constraint propagation, in turn, increases the supervision information while decreasing the cost of human labor. Finally, experimental results validate our proposal and show that it compares favorably with other known feature selection methods.
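The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the Gaussian similarity, the `separation` score, the step size `delta`, and the brute-force recomputation of the eigenvector for every perturbed pair are all assumptions made for illustration. The paper's keywords point to matrix perturbation theory, which would presumably avoid this recomputation. The sketch only shows how the second eigenvector of the graph Laplacian can act as a cluster indicator and how candidate pairs could be ranked by the effect of a small similarity change.

```python
import numpy as np


def gaussian_similarity(X, sigma=2.0):
    """Dense Gaussian-kernel similarity matrix with a zero diagonal (assumed kernel)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def fiedler_vector(W):
    """Eigenvector of the second-smallest eigenvalue of L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    return eigvecs[:, 1]


def separation(v):
    """Hypothetical score: gap between the positive and non-positive entries of v."""
    return v[v > 0].min() - v[v <= 0].max()


def rank_pairs(W, delta=0.05):
    """Rank pairs (i, j) by the best separation gain from perturbing W[i, j] by +/- delta."""
    base = separation(fiedler_vector(W))
    n = W.shape[0]
    scored = []
    for i in range(n):
        for j in range(i + 1, n):
            best = -np.inf
            for step in (delta, -delta):
                Wp = W.copy()
                # Keep the matrix symmetric and the weights non-negative.
                Wp[i, j] = Wp[j, i] = max(W[i, j] + step, 0.0)
                best = max(best, separation(fiedler_vector(Wp)) - base)
            scored.append(((i, j), best))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data: two small, well-separated groups of points.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
W = gaussian_similarity(X)
labels = fiedler_vector(W) > 0   # sign of the second eigenvector as cluster indicator
ranked = rank_pairs(W)           # candidate pairwise queries, most influential first
print(labels)
print(ranked[:3])
```

In this toy setting the sign pattern of the second eigenvector splits the two groups, and the top-ranked pairs are the ones an oracle would be asked about first. The brute-force loop costs one eigendecomposition per perturbed pair, so it is only practical for very small data sets.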
Keywords: Feature selection; Active learning; Pairwise constraint selection; Constraint propagation; Graph Laplacian; Uncertainty reduction; Matrix perturbation; 15A18; 65F15; 62H30; 68T10; 47N10
Date: 2021
Downloads: (external link)
http://link.springer.com/10.1007/s11634-020-00408-5 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:advdac:v:15:y:2021:i:2:d:10.1007_s11634-020-00408-5
Ordering information: This journal article can be ordered from
http://www.springer. ... ds/journal/11634/PS2
DOI: 10.1007/s11634-020-00408-5
Advances in Data Analysis and Classification is currently edited by H.-H. Bock, W. Gaul, A. Okada, M. Vichi and C. Weihs
More articles in Advances in Data Analysis and Classification from Springer, German Classification Society - Gesellschaft für Klassifikation (GfKl), Japanese Classification Society (JCS), Classification and Data Analysis Group of the Italian Statistical Society (CLADAG), International Federation of Classification Societies (IFCS)
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.