Document representation in probabilistic models of information retrieval
W. Bruce Croft
Journal of the American Society for Information Science, 1981, vol. 32, issue 6, 451-457
Abstract:
Probabilistic models of retrieval have provided insights into the document retrieval process and contain the basis for very effective search strategies. A major limitation of these models is that they assume that documents are represented by binary index terms. In many cases the index terms will be assigned weights, such as within‐document frequency weights, which are derived from the content of the documents by the indexing process. These weights, which are referred to here as term significance weights, indicate the relative importance of the terms in individual documents. This article describes how retrieval models which use either independence or dependence assumptions can be extended to include document representatives containing term significance weights. Comparison with other research indicates that search strategies based on models modified in this way can further improve the effectiveness of document retrieval systems.
Date: 1981
References: Add references at CitEc
Citations:
Downloads: (external link)
https://doi.org/10.1002/asi.4630320609
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:bla:jamest:v:32:y:1981:i:6:p:451-457
Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1097-4571
Access Statistics for this article
More articles in Journal of the American Society for Information Science from Association for Information Science & Technology
Bibliographic data for series maintained by Wiley Content Delivery ().