An entropy‐based interpretation of retrieval status value‐based retrieval, and its application to the computation of term and query discrimination value

Sándor Dominich, Júlia Góth, Tamás Kiezer and Zoltán Szlávik

Journal of the American Society for Information Science and Technology, 2004, vol. 55, issue 7, 613-627

Abstract: The concepts of Shannon information and entropy have been applied to a number of information retrieval tasks, such as formalizing the probabilistic model, designing practical retrieval systems, clustering documents, and modeling texture in image retrieval. In this report, the concept of entropy is used for a different purpose. It is shown that any positive Retrieval Status Value (RSV)‐based retrieval system may be conceived as a special probability space in which the amount of the associated Shannon information is being reduced; in this view, the retrieval system is referred to as an Uncertainty Decreasing Operation (UDO). The concept of UDO is then proposed as a theoretical background for term and query discrimination power, and it is applied to the computation of term and query discrimination values in the vector space retrieval model. Experimental evidence is given for this computation; the results compare well to those obtained using vector‐based calculation of term discrimination values. The UDO‐based computation, however, has advantages over the vector‐based calculation: it is faster, easier to assess and handle in practice, and its application is not restricted to the vector space model. Based on the ADI test collection, it is shown that the UDO‐based Term Discrimination Value (TDV) weighting scheme yields better retrieval effectiveness than the vector‐based TDV weighting scheme. Experimental evidence is also given for the intuition that the choice of an appropriate weighting scheme and similarity measure depends on collection properties, and thus the UDO approach may be used as a theoretical basis for this intuition.
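
The entropy view described in the abstract can be illustrated with a minimal sketch. It assumes, as one plausible reading and not as the authors' stated construction, that strictly positive RSVs are turned into a probability distribution by dividing each by their sum, and that the uncertainty decrease is measured against a uniform distribution over the same documents. The function names `rsv_to_probabilities`, `shannon_entropy`, and `uncertainty_reduction` are hypothetical and do not come from the paper.

```python
import math


def rsv_to_probabilities(rsvs):
    """Normalize strictly positive RSVs so they sum to 1.

    Assumption: sum-normalization is used only for illustration;
    the abstract does not specify how the probability space is built.
    """
    if not rsvs or any(r <= 0 for r in rsvs):
        raise ValueError("all RSVs must be strictly positive")
    total = sum(rsvs)
    return [r / total for r in rsvs]


def shannon_entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def uncertainty_reduction(rsvs):
    """Entropy of the uniform distribution over the documents minus
    the entropy of the RSV-induced distribution (assumed baseline)."""
    probs = rsv_to_probabilities(rsvs)
    return math.log2(len(rsvs)) - shannon_entropy(probs)


if __name__ == "__main__":
    # Hypothetical RSVs for three documents against one query.
    scores = [2.0, 1.0, 1.0]
    print(shannon_entropy(rsv_to_probabilities(scores)))
    print(uncertainty_reduction(scores))
```

Under these assumptions, equal RSVs give no reduction at all, while a more skewed ranking gives a larger one, which matches the intuition that a discriminating query or term concentrates the scores on fewer documents.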

Date: 2004

Downloads: (external link)
https://doi.org/10.1002/asi.20008

Persistent link: https://EconPapers.repec.org/RePEc:bla:jamist:v:55:y:2004:i:7:p:613-627

Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1532-2890

More articles in Journal of the American Society for Information Science and Technology from Association for Information Science & Technology
Handle: RePEc:bla:jamist:v:55:y:2004:i:7:p:613-627