An Empirical Evaluation of Similarity Coefficients for Binary Valued Data
David M. Lewis and Vandana P. Janeja
Additional contact information
David M. Lewis: Carnegie Mellon University, USA
Vandana P. Janeja: University of Maryland, Baltimore County, USA
International Journal of Data Warehousing and Mining (IJDWM), 2011, vol. 7, issue 2, 44-66
Abstract:
In this paper, the authors present an empirical evaluation of similarity coefficients for binary valued data. A similarity coefficient measures the similarity or distance between two objects in a dataset whose attributes each take a value of 0 or 1. Such measures are useful in several domains, including comparison of feature vectors in sensor networks, document search, router network mining, and web mining. The authors survey 35 similarity coefficients used in various domains and evaluate the similarity they compute under three conditions: (1) labeled data, to quantify the accuracy of the coefficients; (2) varying data density, to evaluate the effect of sparsity; and (3) varying numbers of attributes, to assess the effect of high dimensionality.
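As a rough illustration of what a similarity coefficient for binary data computes, the sketch below derives the usual 2x2 match counts for two 0-1 vectors and evaluates three widely used coefficients: Jaccard, Dice, and simple matching. The abstract does not list the 35 coefficients surveyed, so the choice of these three is an assumption. The function names, the sample vectors, and the convention of returning 0.0 when a denominator is zero are also illustrative and are not taken from the paper.

```python
from typing import Sequence, Tuple


def match_counts(x: Sequence[int], y: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (a, b, c, d) for two equal-length 0-1 vectors.

    a: attributes that are 1 in both vectors
    b: attributes that are 1 in x only
    c: attributes that are 1 in y only
    d: attributes that are 0 in both vectors
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of attributes")
    a = b = c = d = 0
    for xi, yi in zip(x, y):
        if xi and yi:
            a += 1
        elif xi:
            b += 1
        elif yi:
            c += 1
        else:
            d += 1
    return a, b, c, d


def _ratio(num: float, den: float) -> float:
    # Assumed convention: an undefined ratio (zero denominator) is reported as 0.0.
    return num / den if den else 0.0


def jaccard(x: Sequence[int], y: Sequence[int]) -> float:
    """Jaccard coefficient a / (a + b + c); ignores joint absences (d)."""
    a, b, c, _ = match_counts(x, y)
    return _ratio(a, a + b + c)


def dice(x: Sequence[int], y: Sequence[int]) -> float:
    """Dice coefficient 2a / (2a + b + c); weights joint presences twice."""
    a, b, c, _ = match_counts(x, y)
    return _ratio(2 * a, 2 * a + b + c)


def simple_matching(x: Sequence[int], y: Sequence[int]) -> float:
    """Simple matching coefficient (a + d) / (a + b + c + d); counts joint absences."""
    a, b, c, d = match_counts(x, y)
    return _ratio(a + d, a + b + c + d)


if __name__ == "__main__":
    # Hypothetical sample objects with six binary attributes each.
    x = [1, 1, 0, 0, 1, 0]
    y = [1, 0, 0, 1, 1, 0]
    print(match_counts(x, y))
    print(jaccard(x, y), dice(x, y), simple_matching(x, y))
```

Jaccard and Dice ignore joint absences, while simple matching counts them. On sparse data, where most attributes are 0 in both objects, simple matching is therefore pushed toward 1 regardless of how many 1-valued attributes agree. This is one reason a comparison across data densities, as the abstract describes, can separate coefficients that look alike on dense data.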
Date: 2011
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 4018/jdwm.2011040103 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:jdwm00:v:7:y:2011:i:2:p:44-66
International Journal of Data Warehousing and Mining (IJDWM) is currently edited by Eric Pardede
More articles in International Journal of Data Warehousing and Mining (IJDWM) from IGI Global