A sensitivity analysis of a probabilistic information retrieval system
Paul Thompson
Journal of the American Society for Information Science, 1990, vol. 41, issue 5, 348-358
Abstract:
This article presents the results of a set of exploratory simulations testing how errors in the estimation of individual term probabilities affect the performance of a probabilistic information retrieval system. Searches were executed with various levels of term error on a test collection of probabilistically indexed information sources. The amount of error that these term errors introduced into the final probability of relevance used to rank sources was determined analytically. The first simulation analyzed rankings obtained by simulating probabilities of relevance according to the error distribution; the second calculated measures of rank correlation between the simulated rankings and the system's ranking; the third determined the effect of the error on retrieval performance, using the expected search length measure. Substantial error was introduced into the final probabilities of relevance, but for low levels of term error the impact on ranking and retrieval performance was moderate, and even with high levels the actual ranking performed significantly better than a random ranking of retrieved sources. © 1990 John Wiley & Sons, Inc.
Date: 1990
Downloads: (external link)
https://doi.org/10.1002/(SICI)1097-4571(199007)41:53.0.CO;2-K
Persistent link: https://EconPapers.repec.org/RePEc:bla:jamest:v:41:y:1990:i:5:p:348-358
Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1097-4571
More articles in Journal of the American Society for Information Science from Association for Information Science & Technology
Bibliographic data for series maintained by Wiley Content Delivery.