Sequential dimension reduction and clustering of mixed-type data

Angelos Markos, Odysseas Moschidis and Theodore Chadjipantelis

International Journal of Data Analysis Techniques and Strategies, 2020, vol. 12, issue 3, 228-246

Abstract: Clustering of a set of objects described by a mixture of continuous and categorical variables can be a challenging task. In the context of data reduction, an effective class of methods combines dimension reduction with clustering in the reduced space. In this paper, we review three approaches for sequential dimension reduction and clustering of mixed-type data. The first step of each approach involves the application of principal component analysis on a suitably transformed matrix. In the second step, a partitioning or hierarchical clustering algorithm is applied to the object scores in the reduced space. The common theoretical underpinnings of the three approaches are highlighted. The results of a benchmarking study show that sequential dimension reduction and clustering is an effective strategy, especially when the categorical variables are more informative than the continuous ones with regard to the underlying cluster structure. Strengths and limitations are also demonstrated on a real mixed-type dataset.
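
Illustration: the two-step procedure described in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The transformation shown (z-scored continuous columns plus indicator columns divided by the square root of the category proportion, then centred) is one common FAMD-style choice and is assumed here; the abstract does not say which transformations the three reviewed approaches use. The function names (transform_mixed, pca_scores, kmeans), the toy data and the farthest-point initialisation are all hypothetical.

```python
import numpy as np

def transform_mixed(X_num, X_cat):
    """Assumed FAMD-style coding of a mixed-type data set.
    Continuous columns are z-scored; each categorical column becomes a block
    of indicator columns scaled by 1/sqrt(category proportion) and centred."""
    blocks = [(X_num - X_num.mean(axis=0)) / X_num.std(axis=0)]
    for j in range(X_cat.shape[1]):
        levels = np.unique(X_cat[:, j])
        G = (X_cat[:, [j]] == levels).astype(float)  # n x (number of levels)
        G = G / np.sqrt(G.mean(axis=0))
        blocks.append(G - G.mean(axis=0))
    return np.hstack(blocks)

def pca_scores(M, n_components):
    """Step 1: PCA of the (already centred) matrix M via SVD; returns object scores."""
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def kmeans(F, k, n_iter=100):
    """Step 2: plain Lloyd k-means on the scores F.
    Initial centres are picked deterministically (first point, then the point
    farthest from the centres chosen so far); this is an illustrative choice."""
    centers = [F[0]]
    for _ in range(1, k):
        d = np.min([((F - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(F[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            F[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy data (made up for illustration): two well-separated groups of 40 objects,
# two continuous variables and one categorical variable with three levels.
rng = np.random.default_rng(0)
X_num = np.vstack([rng.normal(0.0, 1.0, size=(40, 2)),
                   rng.normal(8.0, 1.0, size=(40, 2))])
X_cat = np.array([i % 2 for i in range(40)] + [2] * 40).reshape(-1, 1)

scores = pca_scores(transform_mixed(X_num, X_cat), n_components=2)
labels = kmeans(scores, k=2)
print(labels)
```

A hierarchical algorithm could be applied to the same scores in place of k-means; the number of retained components and the number of clusters are left to the analyst in this sketch.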

Keywords: cluster analysis; dimension reduction; correspondence analysis; principal component analysis; PCA; mixed-type data
Date: 2020

Downloads: (external link)
http://www.inderscience.com/link.php?id=108043 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:ids:injdan:v:12:y:2020:i:3:p:228-246

More articles in International Journal of Data Analysis Techniques and Strategies from Inderscience Enterprises Ltd
Handle: RePEc:ids:injdan:v:12:y:2020:i:3:p:228-246