Clustering by measuring local direction centrality for data with heterogeneous density and weak connectivity
Dehua Peng,
Zhipeng Gui (),
Dehe Wang,
Yuncheng Ma,
Zichen Huang,
Yu Zhou and
Huayi Wu
Additional contact information
Dehua Peng: Wuhan University
Zhipeng Gui: Wuhan University
Dehe Wang: Wuhan University
Yuncheng Ma: Wuhan University
Zichen Huang: Wuhan University
Yu Zhou: Wuhan University
Huayi Wu: Wuhan University
Nature Communications, 2022, vol. 13, issue 1, 1-14
Abstract:
Abstract Clustering is a powerful machine learning method for discovering similar patterns according to the proximity of elements in feature space. It is widely used in computer science, bioscience, geoscience, and economics. Although the state-of-the-art partition-based and connectivity-based clustering methods have been developed, weak connectivity and heterogeneous density in data impede their effectiveness. In this work, we propose a boundary-seeking Clustering algorithm using the local Direction Centrality (CDC). It adopts a density-independent metric based on the distribution of K-nearest neighbors (KNNs) to distinguish between internal and boundary points. The boundary points generate enclosed cages to bind the connections of internal points, thereby preventing cross-cluster connections and separating weakly-connected clusters. We demonstrate the validity of CDC by detecting complex structured clusters in challenging synthetic datasets, identifying cell types from single-cell RNA sequencing (scRNA-seq) and mass cytometry (CyTOF) data, recognizing speakers on voice corpuses, and testifying on various types of real-world benchmarks.
Date: 2022
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (1)
Downloads: (external link)
https://www.nature.com/articles/s41467-022-33136-9 Abstract (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-33136-9
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-022-33136-9
Access Statistics for this article
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
More articles in Nature Communications from Nature
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().