Addressing class imbalance in functional data clustering
Catherine Higgins and
Michelle Carey
Additional contact information
Catherine Higgins: University College Dublin, School of Mathematics and Statistics
Michelle Carey: University College Dublin, School of Mathematics and Statistics
Advances in Data Analysis and Classification, 2025, vol. 19, issue 4, No 7, 1023-1050
Abstract:
The goal of functional clustering is twofold: first, to group curves with similar temporal behaviors into separate clusters, and second, to obtain a representative curve that summarizes the typical temporal behavior within each cluster. An important challenge in current functional clustering techniques is class imbalance, where some clusters contain a significantly greater number of curves than others. While class imbalance is extensively addressed in supervised classification, it remains relatively unexplored in unsupervised contexts. To address this gap, we adapt the iterative hierarchical clustering approach, originally designed for multivariate data, to functional data, introducing a novel method called functional iterative hierarchical clustering (funIHC) to handle the clustering of imbalanced functional data. Through comprehensive simulation studies and benchmark datasets, we demonstrate the effectiveness of the funIHC approach. Applying funIHC to gene expression data related to human influenza infection induced by the H3N2 virus, we identify five distinct and biologically meaningful patterns of gene expression. The R and MATLAB code for implementing funIHC is freely accessible at www.fdaatucd.com.
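As a rough illustration of the two goals named in the abstract (grouping curves and summarizing each group by a representative curve) under class imbalance, the sketch below simulates 90 curves of one shape and 10 of another, smooths them, and clusters them. It is not funIHC: plain average-linkage hierarchical clustering from SciPy is swapped in, and the function names, the simulated sine and cosine shapes, the polynomial smoothing, and all parameter values are assumptions made only for this example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def simulate_imbalanced_curves(n_major=90, n_minor=10, n_points=50,
                               noise=0.1, seed=0):
    """Simulate two groups of noisy curves with very different sizes (hypothetical data)."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_points)
    major = np.sin(2 * np.pi * t) + noise * rng.standard_normal((n_major, n_points))
    minor = np.cos(2 * np.pi * t) + noise * rng.standard_normal((n_minor, n_points))
    curves = np.vstack([major, minor])
    truth = np.array([0] * n_major + [1] * n_minor)
    return t, curves, truth


def cluster_curves(t, curves, n_clusters=2, degree=5):
    """Smooth each curve, cluster the smoothed curves, and average each cluster.

    Assumption: a least-squares polynomial stands in for a proper functional
    basis, and generic average-linkage clustering stands in for funIHC.
    """
    # Fit one polynomial per curve; coefs has shape (degree + 1, n_curves).
    coefs = np.polynomial.polynomial.polyfit(t, curves.T, degree)
    # Evaluate the fitted polynomials on the grid: shape (n_curves, n_points).
    smoothed = np.polynomial.polynomial.polyval(t, coefs)
    # Agglomerative clustering on Euclidean distances between smoothed curves.
    tree = linkage(smoothed, method="average", metric="euclidean")
    assigned = fcluster(tree, t=n_clusters, criterion="maxclust")
    # Representative curve per cluster: pointwise mean of its member curves.
    representatives = {
        int(k): curves[assigned == k].mean(axis=0) for k in np.unique(assigned)
    }
    return assigned, representatives


if __name__ == "__main__":
    t, curves, truth = simulate_imbalanced_curves()
    assigned, reps = cluster_curves(t, curves)
    for k, rep in reps.items():
        print(f"cluster {k}: {np.sum(assigned == k)} curves")
```

With well-separated shapes like these, a generic method can recover a small cluster; the difficulty the abstract describes arises when minority clusters are harder to distinguish, which this toy example does not attempt to reproduce.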
Keywords: Functional data analysis; Clustering; Class imbalance; Gene expression; 62H30; 62M10
Date: 2025
Downloads:
http://link.springer.com/10.1007/s11634-024-00611-8 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:advdac:v:19:y:2025:i:4:d:10.1007_s11634-024-00611-8
DOI: 10.1007/s11634-024-00611-8
Advances in Data Analysis and Classification is currently edited by H.-H. Bock, W. Gaul, A. Okada, M. Vichi and C. Weihs
More articles in Advances in Data Analysis and Classification from Springer, German Classification Society - Gesellschaft für Klassifikation (GfKl), Japanese Classification Society (JCS), Classification and Data Analysis Group of the Italian Statistical Society (CLADAG), International Federation of Classification Societies (IFCS)