A New Similarity Metric for Sequential Data

Pradeep Kumar, Bapi S. Raju and P. Radha Krishna
Pradeep Kumar: Indian Institute of Management, India
Bapi S. Raju: University of Hyderabad, India
P. Radha Krishna: Infosys Technologies Limited, Hyderabad, India

International Journal of Data Warehousing and Mining (IJDWM), 2010, vol. 6, issue 4, 16-32

Abstract: In many data mining applications, both classification and clustering algorithms require a distance or similarity measure. For similarity-based clustering and classification of sequential data, the central problem is choosing an appropriate similarity metric. Existing metrics such as Euclidean, Jaccard, and Cosine do not explicitly exploit the sequential nature of the data. In this paper, the authors propose a similarity-preserving function called Sequence and Set Similarity Measure (S3M) that captures both the order in which items occur in a sequence and the items that make up the sequence. The authors evaluate the measure on two benchmark datasets: DARPA’98 for a classification task in intrusion detection, and msnbc for a clustering task in web mining. The results show that the proposed measure is useful for both tasks.
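
The abstract does not give the formula for S3M, so the following is only an illustrative sketch of how a measure might combine order information with set membership. It assumes that order is scored by the length of the longest common subsequence, normalized by the longer sequence, and that item overlap is scored by the Jaccard coefficient. The function names and the weight `p` are hypothetical and are not taken from this listing.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def jaccard(a, b):
    """Jaccard coefficient of the item sets of two sequences."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # assumption: two empty sequences count as identical
    return len(sa & sb) / len(sa | sb)


def combined_similarity(a, b, p=0.5):
    """Weighted mix of an order-aware score and a set-based score.

    p weights the order component and (1 - p) the set component.
    This weighting scheme is an assumption made for illustration.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    longest = max(len(a), len(b))
    order_part = lcs_length(a, b) / longest if longest else 1.0
    return p * order_part + (1.0 - p) * jaccard(a, b)


if __name__ == "__main__":
    # Same items, reversed order: the set score is 1 but the order score is low.
    print(combined_similarity(list("abc"), list("cba"), p=0.5))
```

With equal weights, two sequences that contain the same items in reverse order score strictly between 0 and 1, whereas a purely set-based measure such as Jaccard would treat them as identical.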

Date: 2010

Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 4018/jdwm.2010100102 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jdwm00:v:6:y:2010:i:4:p:16-32

International Journal of Data Warehousing and Mining (IJDWM) is currently edited by Eric Pardede

Page updated 2025-04-12
Handle: RePEc:igg:jdwm00:v:6:y:2010:i:4:p:16-32