CLUSTERING CATEGORICAL AND NUMERICAL DATA: A NEW PROCEDURE USING MULTIDIMENSIONAL SCALING
Sung-Gi Lee and
Deok-Kyun Yun
Additional contact information
Sung-Gi Lee: Department of Industrial Engineering, Hanyang University, Seoul, South Korea;
Deok-Kyun Yun: Department of Industrial Engineering, Hanyang University, Seoul, South Korea;
International Journal of Information Technology & Decision Making (IJITDM), 2003, vol. 02, issue 01, 135-159
Abstract:
In this paper, we present a concept based on the similarity of categorical attribute values considering implicit relationships and propose a new and effective clustering procedure for mixed data. Our procedure obtains similarities between categorical values from careful analysis and maps the values in each categorical attribute into points in two-dimensional coordinate space using multidimensional scaling. These mapped values make it possible to interpret the relationships between attribute values and to directly apply categorical attributes to clustering algorithms using a Euclidean distance. After trivial modifications, our procedure for clustering mixed data uses the k-means algorithm, well known for its efficiency in clustering large data sets. We use the familiar soybean disease and adult data sets to demonstrate the performance of our clustering procedure. The satisfactory results that we have obtained demonstrate the effectiveness of our algorithm in discovering structure in data.
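The pipeline described in the abstract (value similarities, a two-dimensional mapping via multidimensional scaling, then k-means with Euclidean distance) can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, so the one below is an assumption: values of a categorical attribute are compared by the total variation distance between their co-occurrence profiles with a second attribute. Classical (Torgerson) MDS and a plain Lloyd-style k-means stand in for whatever variants the paper uses, and the toy data and all function names are hypothetical.

```python
import numpy as np

def value_dissimilarity(target, context):
    """Assumed measure (not from the paper): total variation distance between
    the conditional distributions of `context` given each value of `target`."""
    t_vals, c_vals = sorted(set(target)), sorted(set(context))
    counts = np.zeros((len(t_vals), len(c_vals)))
    for t, c in zip(target, context):
        counts[t_vals.index(t), c_vals.index(c)] += 1
    profiles = counts / counts.sum(axis=1, keepdims=True)
    diff = profiles[:, None, :] - profiles[None, :, :]
    return t_vals, 0.5 * np.abs(diff).sum(axis=2)

def classical_mds(d, dims=2):
    """Classical MDS: embed a dissimilarity matrix into `dims` coordinates."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:dims]          # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

def kmeans(x, k, iters=50, seed=0):
    """Plain Lloyd iteration with squared Euclidean distance."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(dist, axis=1)
        new_centers = np.array([
            x[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Hypothetical toy data: two categorical attributes and one numerical attribute.
color = ["red", "red", "crimson", "crimson", "blue", "blue"]
shape = ["round", "round", "round", "round", "square", "square"]
size = np.array([1.0, 1.1, 0.9, 1.2, 5.0, 5.2])

# Map each value of `color` to a point in two-dimensional space.
values, dissim = value_dissimilarity(color, shape)
coords = classical_mds(dissim, dims=2)
lookup = {v: coords[i] for i, v in enumerate(values)}

# Replace the categorical column by its coordinates and cluster the mixed data.
embedded = np.array([lookup[v] for v in color])
features = np.column_stack([size, embedded])
labels = kmeans(features, k=2)
print(dict(zip(values, coords.round(3).tolist())))
print(labels)
```

In this sketch the numerical column is used unscaled; in practice the relative scale of numerical attributes and mapped coordinates affects the Euclidean distance, so some normalization would likely be needed.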
Keywords: Data mining; clustering; categorical attributes; multidimensional scaling; k-means algorithm
Date: 2003
Downloads: (external link)
http://www.worldscientific.com/doi/abs/10.1142/S0219622003000549
Access to full text is restricted to subscribers
Persistent link: https://EconPapers.repec.org/RePEc:wsi:ijitdm:v:02:y:2003:i:01:n:s0219622003000549
DOI: 10.1142/S0219622003000549
International Journal of Information Technology & Decision Making (IJITDM) is currently edited by Yong Shi
More articles in International Journal of Information Technology & Decision Making (IJITDM) from World Scientific Publishing Co. Pte. Ltd.
Bibliographic data for series maintained by Tai Tone Lim.