
A black-winged kite improved fuzzy clustering handling imbalanced uncertain data

Hung Tran-Nam and Ha Che-Ngoc

PLOS ONE, 2026, vol. 21, issue 6, 1-44

Abstract: Clustering uncertain data is a fundamental problem in data mining. Imbalance among uncertain objects significantly degrades clustering performance, as minority clusters are repeatedly overshadowed by dominant ones. Consequently, existing clustering techniques often fail due to initialisation biases and inadequate similarity modelling. This paper proposes a novel algorithm, the Black-winged Kite Improved Fuzzy clustering for probability density Functions (BKIFF), which combines an optimisation-based initialisation strategy with an enhanced fuzzy clustering framework. Specifically, BKIFF incorporates the Hellinger distance into the clustering objective to more reliably capture similarities between probability density functions (pdfs), and introduces improved membership updating and prototype estimation mechanisms tailored for uncertain and imbalanced data. This framework is formulated as Improved Fuzzy clustering for probability density Functions (IFF), and theoretical convergence is established. In addition, the algorithm employs Black-winged Kite Optimisation (BKO) to enhance prototype selection, improving clustering stability and convergence. Comprehensive experiments with synthetic Gaussian probability distributions, skewed pdfs, and real-world image datasets demonstrate that BKIFF consistently outperforms baseline methods such as FCF, FCF-ℒ1, KMEANS, and Self-Updating. Across all three examples, BKIFF achieves near-perfect ARI, improving from near-zero values in highly imbalanced cases {20,50,80,100} by approximately 30–35% in moderate settings, while increasing NMI by about 25–95%. Additionally, it reduces computational time by approximately 95–99% compared to baseline methods. In conclusion, BKIFF demonstrates superior performance and opens up new possibilities for applications in medical diagnostics, ecological analysis, and high-dimensional uncertain data mining, particularly in imbalanced environments.
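
For readers unfamiliar with the Hellinger distance named in the abstract, the following is a minimal sketch of how it can be evaluated between two pdfs sampled on a uniform grid. It is not the authors' implementation: the function names `gaussian_pdf` and `hellinger`, the grid, and the example Gaussians are illustrative assumptions, and the normalisation H(f, g) = sqrt(0.5 ∫ (√f − √g)² dx), which keeps the distance in [0, 1], is one common convention that the paper may or may not follow.

```python
import numpy as np


def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated on the grid x (illustrative helper)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def hellinger(f, g, dx):
    """Hellinger distance between two pdfs sampled on the same uniform grid.

    Assumed convention: H = sqrt(0.5 * integral (sqrt f - sqrt g)^2 dx),
    with the integral approximated by a Riemann sum of spacing dx.
    """
    h2 = 0.5 * np.sum((np.sqrt(f) - np.sqrt(g)) ** 2) * dx
    return float(np.sqrt(h2))


# Hypothetical example: two unit-variance Gaussians whose means differ by 1.
x = np.linspace(-10.0, 11.0, 4001)
dx = x[1] - x[0]
f = gaussian_pdf(x, 0.0, 1.0)
g = gaussian_pdf(x, 1.0, 1.0)

d_fg = hellinger(f, g, dx)
d_ff = hellinger(f, f, dx)

# For equal-variance Gaussians the distance has the closed form
# sqrt(1 - exp(-(mu1 - mu2)^2 / (8 sigma^2))), used here as a sanity check.
closed_form = float(np.sqrt(1.0 - np.exp(-1.0 / 8.0)))

print(d_fg, closed_form, d_ff)
```

In a distance-based fuzzy clustering scheme, a quantity like `d_fg` would be computed between each pdf and each cluster prototype; how BKIFF uses it inside its membership and prototype updates is specified in the paper itself.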

Date: 2026

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0349753 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 49753&type=printable (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0349753

DOI: 10.1371/journal.pone.0349753

More articles in PLOS ONE from Public Library of Science
Page updated 2026-06-14
Handle: RePEc:plo:pone00:0349753