Ensemble clustering for step data via binning
Ja‐Yoon Jang, Hee‐Seok Oh, Yaeji Lim and Ying Kuen Cheung
Biometrics, 2021, vol. 77, issue 1, 293-304
Abstract:
This paper considers the problem of clustering physical step count data recorded on wearable devices. Clustering step data gives insight into an individual's activity status and provides the groundwork for health‐related policies. However, classical methods, such as K‐means clustering and hierarchical clustering, are not suitable for step count data, which are typically high‐dimensional and zero‐inflated. This paper presents a new clustering method for step data based on a novel combination of ensemble clustering and binning. We first construct multiple sets of binned data by changing the size and starting position of the bin, and then merge the clustering results from the binned data using a voting method. The advantage of binning, as a critical component, is that it substantially reduces the dimension of the original data while preserving the essential characteristics of the data. As a result, combining clustering results from multiple binned data sets can provide an improved clustering result that reflects both local and global structures of the data. Simulation studies and real data analysis were carried out to evaluate the empirical performance of the proposed method and demonstrate its general utility.
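The binning-and-voting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the base clusterer or the exact voting rule, so the sketch assumes plain K-means on each binned data set, label alignment by brute-force permutation matching against the first partition, and a per-subject majority vote. All function names (`bin_series`, `kmeans`, `align`, `ensemble_cluster`) and the toy data are hypothetical.

```python
import itertools
import numpy as np


def bin_series(X, width, start):
    """Sum each row over consecutive bins of `width` columns, starting at column `start`.
    Simplification: columns before `start` and any incomplete last bin are dropped."""
    n, p = X.shape
    usable = (p - start) // width * width
    return X[:, start:start + usable].reshape(n, -1, width).sum(axis=2)


def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's algorithm (assumed base clusterer); returns one label per row."""
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def align(labels, reference, k):
    """Relabel `labels` with the permutation that agrees most with `reference`.
    Brute force over k! permutations, so only sensible for small k."""
    best = max(itertools.permutations(range(k)),
               key=lambda perm: np.sum(np.asarray(perm)[labels] == reference))
    return np.asarray(best)[labels]


def ensemble_cluster(X, k, widths, starts, seed=0):
    """Cluster every (width, start) binning, align the labels, then majority-vote."""
    rng = np.random.default_rng(seed)
    partitions = []
    for width in widths:
        for start in starts:
            if start >= width:  # a shift of a full bin repeats an earlier binning
                continue
            labels = kmeans(bin_series(X, width, start), k, rng)
            if partitions:
                labels = align(labels, partitions[0], k)
            partitions.append(labels)
    votes = np.stack(partitions)                                      # (n_partitions, n)
    counts = np.stack([(votes == j).sum(axis=0) for j in range(k)])   # (k, n)
    return counts.argmax(axis=0)


# Toy zero-inflated data: 20 "morning" and 20 "evening" subjects, 288 five-minute slots.
rng = np.random.default_rng(1)
X = np.zeros((40, 288))
X[:20, 60:120] = rng.poisson(20, (20, 60)) * (rng.random((20, 60)) < 0.6)
X[20:, 200:260] = rng.poisson(20, (20, 60)) * (rng.random((20, 60)) < 0.6)
labels = ensemble_cluster(X, k=2, widths=[6, 12, 24], starts=[0, 3])
```

Small bin widths keep local detail while large widths capture the overall daily pattern, which is one way to read the abstract's claim that the combined result reflects both local and global structure. The choice of widths and starting positions here is arbitrary and would need tuning for real step data.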
Date: 2021
DOI: https://doi.org/10.1111/biom.13258
Persistent link: https://EconPapers.repec.org/RePEc:bla:biomet:v:77:y:2021:i:1:p:293-304