A Clustering Ensemble Framework with Integration of Data Characteristics and Structure Information: A Graph Neural Networks Approach

Du, Hang-Yuan; Wang, Wen-Jian

Hang-Yuan Du and Wen-Jian Wang
Additional contact information
Hang-Yuan Du: School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
Wen-Jian Wang: School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China

Mathematics, 2022, vol. 10, issue 11, 1-23

Abstract: Clustering ensemble is a research hotspot of data mining that aggregates several base clustering results to generate a single output clustering with improved robustness and stability. However, the validity of the ensemble result is usually affected by unreliability in the generation and integration of base clusterings. To address this issue, we develop a clustering ensemble framework, viewed from the perspective of graph neural networks, that generates an ensemble result by integrating data characteristics and structure information. In this framework, we extract structure information from base clustering results of the data set by using a coupling affinity measure. After that, we combine structure information with data characteristics by using a graph neural network (GNN) to learn their joint embeddings in latent space. Then, we employ a Gaussian mixture model (GMM) to predict the final cluster assignment in the latent space. Finally, we construct the GNN and GMM as a unified optimization model to integrate the objectives of graph embedding and consensus clustering. Our framework not only combines information in feature space and structure space, but also achieves suitable representations for final cluster partitioning. Thus, it can produce an outstanding result. Experimental results on six synthetic benchmark data sets and six real-world data sets show that the proposed framework yields better performance than 12 reference algorithms that are based on either a clustering ensemble architecture or a deep clustering strategy.
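
The structure-extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the coupling affinity measure, so the code below swaps in the classical co-association matrix: for each pair of points, the fraction of base clusterings that place them in the same cluster. The function name `co_association` and the toy label lists are hypothetical and are not taken from the paper.

```python
def co_association(base_clusterings):
    """Return an n x n matrix whose (i, j) entry is the fraction of base
    clusterings that assign points i and j to the same cluster.

    NOTE: this is the classical co-association matrix, used here as a
    stand-in. It is NOT the paper's coupling affinity measure.
    """
    if not base_clusterings:
        raise ValueError("at least one base clustering is required")
    n = len(base_clusterings[0])
    m = len(base_clusterings)
    counts = [[0] * n for _ in range(n)]
    for labels in base_clusterings:
        if len(labels) != n:
            raise ValueError("all base clusterings must label the same points")
        for i in range(n):
            for j in range(n):
                # Label values are only compared within one clustering,
                # so differing label names across clusterings are harmless.
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


# Toy example (assumed data): three base clusterings of four points.
base = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
affinity = co_association(base)
for row in affinity:
    print([round(v, 2) for v in row])
```

In a pipeline of the kind the abstract outlines, a matrix like this could serve as the weighted adjacency of a graph over the data points, which a GNN would then combine with the original features. That downstream part is not sketched here.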

Keywords: clustering ensemble; graph neural networks; graph embedding; structure information extraction; information integration; generative model
JEL-codes: C
Date: 2022
Citations: 2 (in EconPapers)

Downloads: (external link)
https://www.mdpi.com/2227-7390/10/11/1834/pdf (application/pdf)
https://www.mdpi.com/2227-7390/10/11/1834/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:10:y:2022:i:11:p:1834-:d:824956

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.