Indexing exhaustivity and the computation of similarity matrices
Alan F. Harding and Peter Willett
Journal of the American Society for Information Science, 1980, vol. 31, issue 4, 298-300
Abstract:
Some of the automatic classification procedures used in information retrieval derive clusters of documents from an intermediate similarity matrix, the computation of which involves comparing each of the documents in the collection with all of the others. It has recently been suggested that many of these comparisons, specifically those between documents having no terms in common, may be avoided by means of the use of an inverted file to the document collection. This communication shows that the approach will effect reductions in the number of interdocument comparisons only if the documents are each indexed by a limited number of indexing terms; if exhaustive indexing is used, many document pairs will be compared several times over and the computation will be greater than when conventional approaches are used to generate the similarity matrix.
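The effect described in the abstract can be illustrated with a small sketch. It is not the authors' procedure: it assumes that the inverted-file approach processes every pair of documents in each term's posting list, so that a pair sharing k terms is handled k times. The function names and the two toy collections are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_inverted_file(docs):
    """Map each indexing term to the list of documents that contain it."""
    inverted = defaultdict(list)
    for doc_id, terms in docs.items():
        for term in set(terms):  # ignore repeated terms within a document
            inverted[term].append(doc_id)
    return inverted


def conventional_comparisons(docs):
    """Every document is compared once with every other: n(n-1)/2 pairs."""
    n = len(docs)
    return n * (n - 1) // 2


def inverted_file_comparisons(docs):
    """Count pair operations when walking each posting list in turn.

    Assumption: each posting list contributes one operation per pair of
    documents it contains, so a pair is revisited once per shared term.
    Returns the operation count and the shared-term count for each pair.
    """
    shared_terms = defaultdict(int)
    operations = 0
    for postings in build_inverted_file(docs).values():
        for a, b in combinations(sorted(postings), 2):
            shared_terms[(a, b)] += 1
            operations += 1
    return operations, dict(shared_terms)


# Hypothetical collections: few terms per document vs. exhaustive indexing.
sparse = {1: ["a", "b"], 2: ["b", "c"], 3: ["d"], 4: ["e"]}
exhaustive = {
    1: ["a", "b", "c", "d"],
    2: ["a", "b", "c", "d"],
    3: ["a", "b", "c"],
    4: ["b", "c", "d"],
}

for name, docs in (("sparse", sparse), ("exhaustive", exhaustive)):
    ops, _ = inverted_file_comparisons(docs)
    print(name, "conventional:", conventional_comparisons(docs), "inverted file:", ops)
```

Under these assumptions the sparsely indexed collection needs far fewer pair operations through the inverted file than the conventional all-pairs pass, while the exhaustively indexed collection needs more, because the same pairs recur in several posting lists.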
Date: 1980
DOI: https://doi.org/10.1002/asi.4630310411
Persistent link: https://EconPapers.repec.org/RePEc:bla:jamest:v:31:y:1980:i:4:p:298-300
Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1097-4571
More articles in Journal of the American Society for Information Science from Association for Information Science & Technology