A SEQUENCE-ELEMENT-BASED HIERARCHICAL CLUSTERING ALGORITHM FOR CATEGORICAL SEQUENCE DATA

Seung-Joon Oh and Jae-Yearn Kim
Additional contact information
Seung-Joon Oh: Department of Industrial Engineering, Hanyang University, 17 Haengdang-Dong, Sungdong-Ku, Seoul, 133-791, South Korea
Jae-Yearn Kim: Department of Industrial Engineering, Hanyang University, 17 Haengdang-Dong, Sungdong-Ku, Seoul, 133-791, South Korea

International Journal of Information Technology & Decision Making (IJITDM), 2005, vol. 04, issue 01, 81-96

Abstract: Recently, there has been enormous growth in the amount of commercial and scientific data, such as protein sequences, retail transactions, and web-logs. Such datasets consist of sequence data that have an inherent sequential nature. However, few existing clustering algorithms consider sequentiality. In this paper, we study how to cluster these sequence datasets. We propose a new similarity measure to compute the similarity between two sequences. In the proposed measure, subsets of a sequence are considered, and the more identical subsets there are, the more similar the two sequences. In addition, we propose a hierarchical clustering algorithm and an efficient method for measuring similarity. Using a splice dataset and synthetic datasets, we show that the quality of clusters generated by our proposed approach is better than that of clusters produced by traditional clustering algorithms.
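
The abstract does not spell out the similarity measure or the clustering procedure, so the following is only a minimal sketch of the general idea it describes: two sequences count as more similar when they share more identical sub-parts, and similar sequences are merged bottom-up. Everything specific here is an assumption made for illustration and is not taken from the paper: the use of contiguous subsequences up to `max_len`, the Jaccard ratio as the score, average linkage for merging, and the function names `subsequences`, `similarity`, and `cluster`.

```python
from itertools import combinations


def subsequences(seq, max_len=3):
    """Set of contiguous subsequences of seq with length 1..max_len (assumed definition)."""
    return {
        tuple(seq[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(seq) - n + 1)
    }


def similarity(a, b, max_len=3):
    """Jaccard overlap of the two subsequence sets (a stand-in, not the paper's measure)."""
    sa, sb = subsequences(a, max_len), subsequences(b, max_len)
    if not sa and not sb:
        return 1.0  # two empty sequences are treated as identical
    return len(sa & sb) / len(sa | sb)


def cluster(seqs, k, max_len=3):
    """Average-linkage agglomerative clustering; returns k lists of sequence indices."""
    clusters = [[i] for i in range(len(seqs))]
    # Pairwise similarities are computed once and reused at every merge step.
    sim = {
        (i, j): similarity(seqs[i], seqs[j], max_len)
        for i, j in combinations(range(len(seqs)), 2)
    }

    def linkage(c1, c2):
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(sim[p] for p in pairs) / len(pairs)

    while len(clusters) > k:
        # Merge the pair of clusters with the highest average similarity.
        x, y = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y
    return clusters


if __name__ == "__main__":
    toy = ["ACGTAC", "ACGTTC", "GGCCGG", "GGCCGA"]  # made-up toy sequences
    print(cluster(toy, 2))
```

On the toy input, the two `ACGT...` strings and the two `GGCC...` strings end up in separate clusters. Order matters in this sketch because subsequences are stored as ordered tuples, which is one simple way to respect the sequential nature the abstract emphasizes.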

Keywords: Data mining; hierarchical clustering; sequences; similarity measure
Date: 2005

Downloads: (external link)
http://www.worldscientific.com/doi/abs/10.1142/S0219622005001398
Access to full text is restricted to subscribers



Persistent link: https://EconPapers.repec.org/RePEc:wsi:ijitdm:v:04:y:2005:i:01:n:s0219622005001398


DOI: 10.1142/S0219622005001398


International Journal of Information Technology & Decision Making (IJITDM) is currently edited by Yong Shi

More articles in International Journal of Information Technology & Decision Making (IJITDM) from World Scientific Publishing Co. Pte. Ltd.
Bibliographic data for series maintained by Tai Tone Lim.

 
Handle: RePEc:wsi:ijitdm:v:04:y:2005:i:01:n:s0219622005001398