Text searching retrieval of answer‐sentences and other answer‐passages
John O'Connor
Journal of the American Society for Information Science, 1973, vol. 24, issue 6, 445-460
Abstract:
Some new text searching retrieval techniques are described which retrieve not documents but sentences from documents and sometimes (on occasions determined by the computer) multi‐sentence sequences. Since the goal of the techniques is retrieval of answer‐providing documents, “answer‐passages” are retrieved. An “answer‐passage” is a passage which is either answer‐providing or “answer‐indicative,” i.e., it permits inferring that the document containing it is answer‐providing. In most cases answer‐sentences, i.e., single‐sentence answer‐passages, are retrieved. This has great advantages for screening retrieval output. Two new automatic procedures for measuring closeness of relation between clue words in a sentence are described. One approximates syntactic closeness by counting the number of intervening “syntactic joints” (roughly speaking, prepositions, conjunctions and punctuation marks) between successive clue words. The other measure uses word proximity in a new way. The two measures perform about equally well. The computer uses “enclosure” and “connector words” for determining when a multi‐sentence passage should be retrieved. However, no procedure was found in this study for retrieving multi‐paragraph answer‐passages, which were the only answer‐passages occurring in 6% of the papers. In a test of the techniques they failed to retrieve two answer‐providing documents (7% of those to be retrieved) because of one multi‐paragraph answer‐passage and one complete failure of clue word selection. For the other answer‐providing documents they retrieved at all recall levels with greater precision than SMART, which has produced the best previously reported recall‐precision results. The retrieval questions (mostly from real users) and documents used in this study were from the field of information science.
The results of the study are surprisingly good for retrieval in such a “soft science,” and it is reasonable to hope that in less “soft” sciences and technologies the techniques described will work even better. On this basis a dissemination and retrieval system of the near future is predicted.
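The "syntactic joints" measure mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the joint vocabulary or any scoring rule, so the word lists, the tokenizer, and the function names `tokenize` and `joints_between_clues` below are assumptions made for illustration only, not the procedure used in the paper.

```python
import re

# Assumed, deliberately small joint vocabulary. The abstract only says
# "roughly speaking, prepositions, conjunctions and punctuation marks".
PREPOSITIONS = {"of", "in", "on", "for", "with", "by", "to", "from", "at"}
CONJUNCTIONS = {"and", "or", "but", "nor", "while", "because"}
PUNCTUATION = {",", ";", ":", "(", ")"}
JOINTS = PREPOSITIONS | CONJUNCTIONS | PUNCTUATION


def tokenize(sentence):
    """Split a sentence into lowercase word tokens and single punctuation marks."""
    return re.findall(r"[a-z0-9]+(?:[-'][a-z0-9]+)*|[^\sa-z0-9]", sentence.lower())


def joints_between_clues(sentence, clue_words):
    """Count joint tokens lying strictly between each pair of successive clue words."""
    tokens = tokenize(sentence)
    clues = {w.lower() for w in clue_words}
    positions = [i for i, tok in enumerate(tokens) if tok in clues]
    return [
        sum(1 for tok in tokens[a + 1:b] if tok in JOINTS)
        for a, b in zip(positions, positions[1:])
    ]


if __name__ == "__main__":
    sentence = "Retrieval of sentences from documents, and of passages, is described."
    print(joints_between_clues(sentence, {"retrieval", "sentences", "passages"}))
```

Under these assumptions, a lower count between two clue words is read as a closer syntactic relation; how the counts are thresholded or combined into a retrieval decision is not stated in the abstract and is left out here.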
Date: 1973
DOI: https://doi.org/10.1002/asi.4630240606
Persistent link: https://EconPapers.repec.org/RePEc:bla:jamest:v:24:y:1973:i:6:p:445-460