A New Similarity Metric for Sequential Data
Pradeep Kumar, Bapi S. Raju and P. Radha Krishna
Additional contact information
Pradeep Kumar: Indian Institute of Management, India
Bapi S. Raju: University of Hyderabad, India
P. Radha Krishna: Infosys Technologies Limited, Hyderabad, India
International Journal of Data Warehousing and Mining (IJDWM), 2010, vol. 6, issue 4, 16-32
Abstract:
In many data mining applications, both classification and clustering algorithms require a distance or similarity measure. For similarity-based clustering and classification of sequential data, the central problem is choosing an appropriate similarity metric. Existing metrics such as Euclidean, Jaccard, and Cosine do not explicitly exploit the sequential nature of the data. In this paper, the authors propose a similarity-preserving function called Sequence and Set Similarity Measure (S3M) that captures both the order of occurrence of items in sequences and the constituent items of sequences. The authors demonstrate the usefulness of the proposed measure through experiments on two benchmark datasets: DARPA’98 for the classification task in the intrusion detection domain, and msnbc for the clustering task in the web mining domain.
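The abstract does not state how S3M is computed, so the following Python sketch is an illustration only, not the authors' definition. It assumes, purely for the example, that an order-aware term (length of the longest common subsequence divided by the longer sequence length) and a set-based term (Jaccard similarity of the item sets) are blended with a weight `p`. The function names and the parameter `p` are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def jaccard(a, b):
    """Jaccard similarity of the item sets of a and b (order is ignored)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # two empty sequences are treated as identical
    return len(set_a & set_b) / len(union)


def sequence_set_similarity(a, b, p=0.5):
    """Assumed blend: p weights the order term, (1 - p) the set term."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    longest = max(len(a), len(b))
    order_term = lcs_length(a, b) / longest if longest else 1.0
    return p * order_term + (1.0 - p) * jaccard(a, b)


# Two sessions with the same items in a different order.
s1 = ["login", "ls", "cat", "logout"]
s2 = ["login", "cat", "ls", "logout"]
print(jaccard(s1, s2))                         # 1.0, the set view sees no difference
print(sequence_set_similarity(s1, s2, p=0.5))  # 0.875, the order term lowers the score
```

Under these assumptions, a purely set-based measure rates the two sessions as identical, while the blended score drops because only three of the four items appear in a common order.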
Date: 2010
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 4018/jdwm.2010100102 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jdwm00:v:6:y:2010:i:4:p:16-32
International Journal of Data Warehousing and Mining (IJDWM) is currently edited by Eric Pardede
More articles in International Journal of Data Warehousing and Mining (IJDWM) from IGI Global