A novel centroids initialisation for K-means clustering in the presence of benign outliers

Karami, Amin; Urréhman, Shafiq; Ghazanfar, Mustansar Ali

Amin Karami, Shafiq Urréhman and Mustansar Ali Ghazanfar

International Journal of Data Analysis Techniques and Strategies, 2020, vol. 12, issue 4, 287-298

Abstract: K-means is one of the most important and widely applied clustering algorithms in learning systems. However, it is sensitive to centroid initialisation, which makes the algorithm unstable. Its performance and stability may degrade if benign outliers (i.e., long-term independence data points) appear in the data. In this paper, we develop a novel algorithm to optimise K-means performance in the presence of benign outliers. We first identify the benign outliers and run K-means on them; K-means is then run over all data points to relocate the cluster centroids, which yields high accuracy. Experimental results on several benchmark and synthetic datasets confirm that the proposed method significantly outperforms some existing approaches in accuracy under the applied performance metrics.
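
The two-stage idea in the abstract (cluster the benign outliers first, then reuse those centroids to start K-means on all points) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not say how benign outliers are identified, so the rule in flag_outliers (the fraction of points farthest from the global mean) is a placeholder assumption, and all function and parameter names are hypothetical.

```python
import math
import random


def lloyd_kmeans(points, centroids, iters=100):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def flag_outliers(points, fraction=0.1):
    """ASSUMPTION (not from the paper): treat the `fraction` of points
    farthest from the global mean as the 'benign outliers'."""
    mean = tuple(sum(axis) / len(points) for axis in zip(*points))
    ranked = sorted(points, key=lambda p: math.dist(p, mean), reverse=True)
    return ranked[:max(1, int(len(points) * fraction))]


def two_stage_kmeans(points, k, fraction=0.1, seed=0):
    """Stage 1: K-means over the flagged outliers only.
    Stage 2: K-means over all points, started from the stage-1 centroids."""
    rng = random.Random(seed)
    outliers = flag_outliers(points, fraction)
    # Fall back to all points if too few outliers were flagged to seed k clusters.
    pool = outliers if len(outliers) >= k else points
    stage1 = lloyd_kmeans(pool, rng.sample(pool, k))
    return lloyd_kmeans(points, stage1)


if __name__ == "__main__":
    # Two small, well-separated grids of points as toy data.
    blob_a = [(x * 0.1, y * 0.1) for x in range(5) for y in range(5)]
    blob_b = [(10 + x * 0.1, 10 + y * 0.1) for x in range(5) for y in range(5)]
    print(sorted(two_stage_kmeans(blob_a + blob_b, k=2)))
```

Swapping flag_outliers for whatever detection rule the full paper specifies would leave the two-stage structure unchanged; only the seeding pool differs.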

Keywords: clustering; K-means; centroid initialisation; benign outlier.
Date: 2020

Downloads: (external link)
http://www.inderscience.com/link.php?id=111498 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:ids:injdan:v:12:y:2020:i:4:p:287-298

More articles in International Journal of Data Analysis Techniques and Strategies from Inderscience Enterprises Ltd
Bibliographic data for series maintained by Sarah Parker.