Relevancy and pertinency in indexing
Alan M. Rees
American Documentation, 1962, vol. 13, issue 1, 93-94
Abstract:
Underlying all types of subject analysis—descriptors, uniterms, subject headings, telegraphic abstracting, etc.—is the fundamental problem of selecting significant concepts and characteristics from a document to be recorded as reference points for use in future retrieval operations. Faced with the several thousand words normally found in a typical document, the analyst selects those words and ideas which seem significant, based upon his subjective knowledge of the subject matter. Such a selection is conditioned by his academic training, observation of the frequency of occurrence of certain words, knowledge of the pattern of use of the literature, acquaintance with the terminology used in phrasing questions to be put to the file, and comparative knowledge or ignorance of the associations of ideas or relationships between the concepts recorded in the document. Pertinency is therefore in the eye of the beholder and is relevant to the state of knowledge at any given time. The difficulty in subject analysis is that of recording characteristics for retrieval at a later date, when the implications inherent in future requests are unknown at the time of recording and when the terminology has not yet crystallized into any standardized form. In the absence of a permanent description, information requests can either be translated into the archaic language and frozen concepts of the file, or the file itself can be updated to match modern concepts and associations and to bring out implications subsequently made apparent by continually evolving technology. The continuous shift in traditional interests is illustrated in the current-awareness type of literature search, where the constant rearrangement of concepts is seen in the attempt to define interests whose relevance is not yet established. Superimposed on this is the problem of finding suitable words to characterize these shifting concepts. The words of the document are not necessarily those in current use, nor will they always be the words used to characterize an information request put to the file at a later date. It is therefore necessary to use an artificial language (code, authority list, notation, etc.) into which both the natural language of the text and the language of the request can be converted. This language should serve as a more permanent and regularized language, cutting through the tangle of synonyms and the infinity of syntactic structures. The coded thesaurus is suggested as a means of providing this intermediary language while at the same time bringing into coincidence the vocabularies of future searchers and the retrieval system and indicating networks of related meaning and associated ideas. The association of ideas in the semantic code is suggested as a yardstick of predetermined relevancy. Experimental data will be presented to facilitate the establishment of objective criteria of relevancy and pertinency in searching operations.
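A minimal sketch of the intermediary-language idea described in the abstract, in Python: a coded thesaurus maps the natural-language terms of both documents and requests to canonical concept codes, so that a request phrased in one vocabulary can be brought into coincidence with documents indexed in another. The thesaurus entries, codes, and function names below are illustrative assumptions, not the semantic code discussed in the article.

    # Hypothetical coded thesaurus: synonyms map to one canonical concept code.
    THESAURUS = {
        "car": "C001", "automobile": "C001", "motorcar": "C001",
        "indexing": "C002", "subject analysis": "C002",
        "retrieval": "C003", "searching": "C003",
    }

    def encode(terms):
        """Convert natural-language terms to canonical concept codes."""
        return {THESAURUS[t] for t in terms if t in THESAURUS}

    def matches(document_terms, request_terms):
        """Retrieve a document if it shares at least one concept code with the request."""
        return bool(encode(document_terms) & encode(request_terms))

    # A request for "automobile searching" coincides with a document indexed
    # under "car" and "retrieval", although the two share no surface words.
    print(matches(["car", "retrieval"], ["automobile", "searching"]))  # True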
Date: 1962
Downloads: https://doi.org/10.1002/asi.5090130113
Persistent link: https://EconPapers.repec.org/RePEc:bla:amedoc:v:13:y:1962:i:1:p:93-94
Ordering information: This journal article can be ordered from https://doi.org/10.1002/(ISSN)1936-6108
American Documentation is currently edited by Javed Mostafa
More articles in American Documentation from Wiley Blackwell