A competitive optimization approach for data clustering and orthogonal non-negative matrix factorization
Ja’far Dehghanpour-Sahron () and
Nezam Mahdavi-Amiri ()
Additional contact information
Ja’far Dehghanpour-Sahron: Sharif University of Technology
Nezam Mahdavi-Amiri: Sharif University of Technology
4OR, 2021, vol. 19, issue 4, No 1, 473-499
Abstract:
Abstract Partitioning a given data-set into subsets based on similarity among the data is called clustering. Clustering is a major task in data mining and machine learning having many applications such as text retrieval, pattern recognition, and web mining. Here, we briefly review some clustering related problems (k-means, normalized k-cut, orthogonal non-negative matrix factorization, ONMF, and isoperimetry) and describe their connections. We formulate the relaxed mean version of the isoperimetry problem as an optimization problem with non-negative orthogonal constraints. We first make use of a gradient-based optimization algorithm to solve this kind of a problem, and then apply a post-processing technique to extract a solution of the clustering problem. Also, we propose a simplified approach to improve upon solution of the 2-dimensional clustering problem, using the N-nearest neighbor graph. Inspired by this technique, we apply a multilevel method for clustering a given data-set to reduce the size of the problem by grouping a number of similar vertices. The number is determined based on two values, namely, the maximum and the average of the edge weights of the vertices connected to a selected vertex. In addition, using the connections between ONMF and k-means and between k-means and the isoperimetry problem, we propose an algorithm to solve the ONMF problem. A comparative performance analysis of our approach with other related methods shows outperformance of our approach, in terms of the obtained misclassification error rate and Rand index, on both benchmark and randomly generated problems as well as hard synthetic data-sets.
Keywords: Clustering; Multilevel method; Normalized k-cut; Optimization problem; Orthogonal non-negative matrix factorization; 65K10; 90C27 (search for similar items in EconPapers)
Date: 2021
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
http://link.springer.com/10.1007/s10288-020-00445-y Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:spr:aqjoor:v:19:y:2021:i:4:d:10.1007_s10288-020-00445-y
Ordering information: This journal article can be ordered from
https://www.springer ... ch/journal/10288/PSE
DOI: 10.1007/s10288-020-00445-y
Access Statistics for this article
4OR is currently edited by Yves Crama, Michel Grabisch and Silvano Martello
More articles in 4OR from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().