Indexing exhaustivity and the computation of similarity matrices

Harding, Alan F.; Willett, Peter

Journal of the American Society for Information Science, 1980, vol. 31, issue 4, 298-300

Abstract: Some of the automatic classification procedures used in information retrieval derive clusters of documents from an intermediate similarity matrix, whose computation involves comparing each document in the collection with all of the others. It has recently been suggested that many of these comparisons, specifically those between documents having no terms in common, may be avoided by using an inverted file for the document collection. This communication shows that the approach reduces the number of interdocument comparisons only if each document is indexed by a limited number of terms; if exhaustive indexing is used, many document pairs will be compared several times over, and the computation will exceed that required by conventional approaches to generating the similarity matrix.
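
The effect described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a simple counting model in which the conventional approach compares every pair of documents once, while the inverted-file approach compares every pair of documents within each term's posting list, so that a pair sharing t terms is compared t times. The function names and the toy collections are hypothetical.

```python
from collections import defaultdict


def conventional_comparisons(docs):
    """Number of comparisons when every document pair is compared once."""
    n = len(docs)
    return n * (n - 1) // 2


def inverted_file_comparisons(docs):
    """Number of comparisons under the assumed inverted-file model.

    Each term's posting list contributes one comparison per pair of
    documents in that list, so pairs sharing several terms are counted
    several times.
    """
    postings = defaultdict(list)
    for doc_id, terms in enumerate(docs):
        for term in set(terms):
            postings[term].append(doc_id)
    return sum(len(p) * (len(p) - 1) // 2 for p in postings.values())


# Toy collections (illustrative only): four documents each.
sparse = [{"a"}, {"b"}, {"a", "c"}, {"d"}]        # few terms per document
exhaustive = [{"a", "b", "c", "d"} for _ in range(4)]  # many shared terms

sparse_counts = (conventional_comparisons(sparse),
                 inverted_file_comparisons(sparse))
exhaustive_counts = (conventional_comparisons(exhaustive),
                     inverted_file_comparisons(exhaustive))
```

Under this model the sparsely indexed collection needs fewer comparisons with the inverted file than with the conventional approach, while the exhaustively indexed collection needs more, because every pair is revisited once per shared term.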

Date: 1980

Downloads: https://doi.org/10.1002/asi.4630310411

Persistent link: https://EconPapers.repec.org/RePEc:bla:jamest:v:31:y:1980:i:4:p:298-300

Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1097-4571

More articles in Journal of the American Society for Information Science from Association for Information Science & Technology
Bibliographic data for series maintained by Wiley Content Delivery.