An information‐theoretic measure of term specificity
S. K. M. Wong and Y. Y. Yao
Journal of the American Society for Information Science, 1992, vol. 43, issue 1, 54-61
Abstract:
The inverse document frequency (IDF) and signal‐noise ratio (S/N) approaches are two well‐known term weighting schemes based on term specificity. However, the existing justifications for these methods are still somewhat inconclusive and sometimes even based on incompatible assumptions. Although both methods are related to term specificity, their relationship has not been thoroughly investigated. An information‐theoretic measure for term specificity is introduced in this study. It is explicitly shown that the IDF weighting scheme can be derived from the proposed approach by assuming that the frequency of occurrence of each index term is uniform within the set of documents containing the term. The information‐theoretic interpretation of term specificity also establishes the relationship between the IDF and S/N methods. © 1992 John Wiley & Sons, Inc.
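The uniformity assumption in the abstract can be illustrated with a small numerical sketch. This is not the formulation given in the paper, which the abstract does not state. It assumes a specificity score defined as the maximum possible entropy over the collection minus the entropy of the term's occurrences across documents. The function names (term_entropy, specificity, idf) and the toy frequencies are hypothetical. Under that assumption, a term spread evenly over the documents that contain it scores exactly log2(N/n), the familiar IDF value.

```python
import math


def term_entropy(freqs):
    """Shannon entropy (bits) of a term's occurrences spread over documents.

    freqs holds the term's frequency in each document that contains it.
    """
    total = sum(freqs)
    return -sum((f / total) * math.log2(f / total) for f in freqs if f > 0)


def specificity(freqs, n_docs):
    """Assumed score: maximum entropy over n_docs minus the observed entropy."""
    return math.log2(n_docs) - term_entropy(freqs)


def idf(n_containing, n_docs):
    """Classical inverse document frequency, base-2 logarithm."""
    return math.log2(n_docs / n_containing)


N = 8              # hypothetical collection size
uniform = [3, 3]   # term occurs equally often in 2 of the 8 documents
skewed = [5, 1]    # same 2 documents, uneven frequencies

uniform_score = specificity(uniform, N)
skewed_score = specificity(skewed, N)
idf_score = idf(len(uniform), N)
```

With the uniform frequencies the assumed score and the IDF value coincide. With the skewed frequencies the entropy drops, so the assumed score rises above IDF, which is the kind of within-document frequency information that IDF alone ignores.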
Date: 1992
Downloads: (external link)
https://doi.org/10.1002/(SICI)1097-4571(199201)43:13.0.CO;2-A
Persistent link: https://EconPapers.repec.org/RePEc:bla:jamest:v:43:y:1992:i:1:p:54-61