Comparison of Similarity Measures for Categorical Data in Hierarchical Clustering

Zdeněk Šulc and Hana Řezanková
Zdeněk Šulc: University of Economics, Prague
Hana Řezanková: University of Economics, Prague

Journal of Classification, 2019, vol. 36, issue 1, No 4, 58-72

Abstract: This paper deals with similarity measures for categorical data in hierarchical clustering that can handle variables with more than two categories and that aspire to replace the simple matching approach commonly used in this area. These similarity measures consider additional characteristics of a dataset, such as the frequency distribution of categories or the number of categories of a given variable. The paper has two main aims: first, to compare and evaluate the selected similarity measures with respect to the quality of the clusters produced in hierarchical clustering; second, to propose new similarity measures for nominal variables. All the examined similarity measures are compared using the mean ranked scores of two internal evaluation coefficients. The analysis is performed on generated datasets, which makes it possible to determine in which particular situations a given similarity measure is recommended.
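
For orientation, the simple matching approach that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the toy data are hypothetical, and the measures proposed in the paper are not reproduced here. Two objects get a similarity equal to the share of variables on which their categories agree, and the dissimilarity (one minus the similarity) is what a hierarchical clustering routine would typically take as input.

```python
from itertools import combinations


def simple_matching_similarity(x, y):
    """Share of variables on which two objects take the same category."""
    if len(x) != len(y):
        raise ValueError("objects must have the same number of variables")
    matches = sum(a == b for a, b in zip(x, y))
    return matches / len(x)


def dissimilarity_matrix(data):
    """Symmetric matrix of 1 - similarity for every pair of objects."""
    n = len(data)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = 1.0 - simple_matching_similarity(data[i], data[j])
    return d


# Hypothetical toy data: three objects described by three nominal variables.
data = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "square"),
]
D = dissimilarity_matrix(data)
```

Simple matching treats every agreement the same way, regardless of how common a category is or how many categories a variable has. Those are the dataset characteristics that, according to the abstract, the compared measures additionally take into account.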

Keywords: Similarity measures; Nominal variables; Hierarchical cluster analysis; Comparison; Evaluation
Date: 2019

Downloads: (external link)
http://link.springer.com/10.1007/s00357-019-09317-5 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:jclass:v:36:y:2019:i:1:d:10.1007_s00357-019-09317-5

DOI: 10.1007/s00357-019-09317-5

Journal of Classification is currently edited by Douglas Steinley

More articles in Journal of Classification from Springer, The Classification Society
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 