The limitations of term co‐occurrence data for query expansion in document retrieval systems
Helen J. Peat and Peter Willett
Journal of the American Society for Information Science, 1991, vol. 42, issue 5, 378-383
Abstract:
Term cooccurrence data has been extensively used in document retrieval systems for the identification of indexing terms that are similar to those that have been specified in a user query: these similar terms can then be used to augment the original query statement. Despite the plausibility of this approach to query expansion, the retrieval effectiveness of the expanded queries is often no greater than, or even less than, the effectiveness of the unexpanded queries. This article demonstrates that the similar terms identified by cooccurrence data in a query expansion system tend to occur very frequently in the database that is being searched. Unfortunately, frequent terms tend to discriminate poorly between relevant and nonrelevant documents, and the general effect of query expansion is thus to add terms that do little or nothing to improve the discriminatory power of the original query. © 1991 John Wiley & Sons, Inc.
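The mechanism the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the function names `dice` and `expand`, and the choice of the Dice coefficient as the similarity measure are all assumptions made for illustration only. The sketch ranks candidate expansion terms by how often they co-occur with a query term, then looks up the document frequency of the terms that were added.

```python
# Hypothetical toy corpus: each document is a set of index terms.
docs = [
    {"retrieval", "query", "system"},
    {"retrieval", "index", "system"},
    {"query", "expansion", "system"},
    {"thesaurus", "expansion"},
    {"retrieval", "system"},
]

# Postings: term -> set of ids of the documents that contain it.
postings = {}
for doc_id, terms in enumerate(docs):
    for term in terms:
        postings.setdefault(term, set()).add(doc_id)


def dice(a, b):
    """Dice coefficient between the document sets of terms a and b.

    Assumption: Dice is used here only as one common co-occurrence
    similarity measure; other coefficients could be substituted.
    """
    da, db = postings.get(a, set()), postings.get(b, set())
    if not da and not db:
        return 0.0
    return 2 * len(da & db) / (len(da) + len(db))


def expand(query, k=1):
    """Return the k terms most similar to each query term."""
    added = set()
    for q in query:
        candidates = [t for t in postings if t not in query]
        # Highest similarity first; ties broken alphabetically.
        candidates.sort(key=lambda t: (-dice(q, t), t))
        added.update(candidates[:k])
    return added


expansion = expand({"query"})
# Document frequency of each added term.
doc_freq = {t: len(postings[t]) for t in expansion}
print(expansion, doc_freq)
```

In this toy corpus the term added for "query" is "system", which is also the most frequent term in the collection (4 of 5 documents). That is consistent with the abstract's observation that co-occurrence-based expansion tends to select frequent, poorly discriminating terms, although a five-document example is only illustrative and proves nothing about real collections.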
Date: 1991
Downloads: (external link)
https://doi.org/10.1002/(SICI)1097-4571(199106)42:53.0.CO;2-8
Persistent link: https://EconPapers.repec.org/RePEc:bla:jamest:v:42:y:1991:i:5:p:378-383
Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1097-4571
More articles in Journal of the American Society for Information Science from Association for Information Science & Technology