Clustering Categorical Data: A Survey

Sami Naouali, Semeh Ben Salem and Zied Chtourou
Additional contact information
Sami Naouali: Virtual Reality and Information Technologies, Military Academy of Fondouk Jedid, Nabeul, Tunisia
Semeh Ben Salem: Polytechnic School of Tunisia, La Marsa, Tunis B.P. 743, Rue El Khawarizmi 2078, Tunisia
Zied Chtourou: Digital Research Center of Sfax, B.P. 275, Sakiet Ezzit, Sfax 3021, Tunisia

International Journal of Information Technology & Decision Making (IJITDM), 2020, vol. 19, issue 01, 49-96

Abstract: Clustering is a complex unsupervised method used to group the most similar observations of a given dataset within the same cluster. To be efficient, the clustering process should achieve high accuracy at low complexity. Many clustering methods have been developed in various fields, depending on the type of application and the data type considered. Categorical clustering segments datasets whose data are categorical and has been widely used in many real-world applications. Several methods have therefore been developed, including hard, fuzzy and rough set-based methods. In this survey, more than 30 categorical clustering algorithms are investigated. These methods are classified into hierarchical and partitional clustering methods and compared in terms of their accuracy, precision and recall to identify the most prominent ones. Experimental results show that rough set-based clustering methods provide better efficiency than hard and fuzzy methods. Methods based on the initialization of the centroids also provide good results.

Keywords: Unsupervised learning; categorical data clustering; rough set theory; fuzzy clustering; hard clustering
Date: 2020

Downloads: (external link)
https://www.worldscientific.com/doi/abs/10.1142/S0219622019300064
Access to full text is restricted to subscribers

Persistent link: https://EconPapers.repec.org/RePEc:wsi:ijitdm:v:19:y:2020:i:01:n:s0219622019300064

DOI: 10.1142/S0219622019300064

International Journal of Information Technology & Decision Making (IJITDM) is currently edited by Yong Shi

More articles in International Journal of Information Technology & Decision Making (IJITDM) from World Scientific Publishing Co. Pte. Ltd.
Bibliographic data for series maintained by Tai Tone Lim.

 
Page updated 2025-03-20
Handle: RePEc:wsi:ijitdm:v:19:y:2020:i:01:n:s0219622019300064