Three-Way Ensemble Clustering Based on Sample’s Perturbation Theory
Jiachen Fan, Xiaoxiao Wang, Tingfeng Wu, Jin Zhu and Pingxin Wang
Additional contact information
Jiachen Fan: School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212003, China
Xiaoxiao Wang: School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212003, China
Tingfeng Wu: School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212003, China
Jin Zhu: School of Science, Jiangsu University of Science and Technology, Zhenjiang 212003, China
Pingxin Wang: School of Science, Jiangsu University of Science and Technology, Zhenjiang 212003, China
Mathematics, 2022, vol. 10, issue 15, 1-19
Abstract:
The complexity of data types and distributions increases the uncertainty in the relationships between samples, which makes it challenging to mine the underlying cluster structure of the data effectively. Ensemble clustering aims to obtain a unified partition by fusing multiple different base clustering results. This paper proposes a three-way ensemble clustering algorithm based on sample’s perturbation theory to address inaccurate decision making caused by inaccurate information or insufficient data. The algorithm first uses the natural nearest neighbor algorithm to generate two sets of perturbed data sets, randomly extracts feature subsets of the samples, and applies a traditional clustering algorithm to obtain different base clusterings. Each sample’s stability is computed from the co-association matrix and a determinacy function, and the samples are then divided into a stable region and an unstable region according to a threshold on that stability. The stable region consists of high-stability samples, which are partitioned into the core regions of the clusters using the K-means algorithm. The unstable region consists of low-stability samples, which are assigned to the fringe regions of the clusters. Together, the core and fringe regions form a three-way clustering result. Experimental results on data sets from the UCI Machine Learning Repository show that the proposed algorithm obtains better clustering results than other clustering ensemble algorithms and effectively reveals the clustering structure.
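The stability step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the determinacy function, so the piecewise-linear form below, its midpoint `t`, the stability threshold, and all function names are assumptions chosen for illustration. The sketch builds a co-association matrix from a few base clusterings, scores each sample's stability, and splits the samples into stable and unstable regions.

```python
from typing import List, Sequence, Tuple


def co_association(labelings: Sequence[Sequence[int]]) -> List[List[float]]:
    """Fraction of base clusterings in which samples i and j share a cluster."""
    m = len(labelings)
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def determinacy(p: float, t: float = 0.5) -> float:
    """Assumed determinacy function: 0 when p == t, 1 when p is 0 or 1.

    A co-association value near 0 or 1 means the base clusterings agree
    about the pair; a value near t means they disagree.
    """
    if p < t:
        return (t - p) / t
    return (p - t) / (1.0 - t)


def sample_stability(ca: List[List[float]], t: float = 0.5) -> List[float]:
    """Average determinacy of each sample's co-association row."""
    n = len(ca)
    return [sum(determinacy(ca[i][j], t) for j in range(n)) / n for i in range(n)]


def split_regions(stability: Sequence[float], threshold: float) -> Tuple[List[int], List[int]]:
    """Indices of high-stability (stable) and low-stability (unstable) samples."""
    stable = [i for i, s in enumerate(stability) if s >= threshold]
    unstable = [i for i, s in enumerate(stability) if s < threshold]
    return stable, unstable


# Four hypothetical base clusterings of five samples; sample 2 switches clusters.
base_clusterings = [
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
]
ca_matrix = co_association(base_clusterings)
stability = sample_stability(ca_matrix)
stable, unstable = split_regions(stability, threshold=0.5)
print(stable, unstable)
```

In this toy input, sample 2 is grouped with samples 0 and 1 in half of the base clusterings and with samples 3 and 4 in the other half, so it lands in the unstable region. In the algorithm the abstract describes, the stable samples would next be clustered with K-means to form core regions, and the unstable samples would be assigned to fringe regions; those steps are omitted here.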
Keywords: three-way clustering; natural nearest neighbor; sample’s perturbation theory; ensemble clustering
JEL-codes: C
Date: 2022
Downloads:
https://www.mdpi.com/2227-7390/10/15/2598/pdf (application/pdf)
https://www.mdpi.com/2227-7390/10/15/2598/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:10:y:2022:i:15:p:2598-:d:871803
Mathematics is currently edited by Ms. Emma He