Retrieving records from a gigabyte of text on a minicomputer using statistical ranking

Harman, Donna; Candela, Gerald

Retrieving records from a gigabyte of text on a minicomputer using statistical ranking

Donna Harman and Gerald Candela

Journal of the American Society for Information Science, 1990, vol. 41, issue 8, 581-589

Abstract: Statistically based ranked retrieval of records using keywords provides many advantages over traditional Boolean retrieval methods, especially for end users. This approach to retrieval, however, has not seen widespread use in large operational retrieval systems. To show the feasibility of this retrieval methodology, research was done to produce very fast search techniques using these ranking algorithms, and then to test the results against large databases with many end users. The results show not only response times on the order of 1 and 1/2 seconds for 806 megabytes of text, but also very favorable user reaction. Novice users were able to consistently obtain good search results after 5 minutes of training. Additional work was done to devise new indexing techniques to create inverted files for large databases using a minicomputer. These techniques use no sorting, require a working space of only about 20% of the size of the input text, and produce indices that are about 14% of the input text size. © 1990 John Wiley & Sons, Inc.

Date: 1990
References: Add references at CitEc
Citations: View citations in EconPapers (1)

Downloads: (external link)
https://doi.org/10.1002/(SICI)1097-4571(199012)41:83.0.CO;2-U

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:bla:jamest:v:41:y:1990:i:8:p:581-589

Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1097-4571

Access Statistics for this article

More articles in Journal of the American Society for Information Science from Association for Information Science & Technology
Bibliographic data for series maintained by Wiley Content Delivery ().