Automatic detection and correction of spelling errors in a large data base
Antonio Zamora
Journal of the American Society for Information Science, 1980, vol. 31, issue 1, 51-57
Abstract:
On‐line bibliographic search systems tend to increase the visibility of spelling errors through the use of indexes of unique terms; even low error rates in a data base can result in large numbers of misspelled terms in these indexes. This article describes the techniques used to detect and correct spelling errors in the data base of Chemical Abstracts Service. A computer program for spelling error detection achieves a high level of performance using hashing techniques for dictionary look‐up and compression. Heuristic procedures extend the dictionary and increase the proportion of misspelled words in the words flagged. Automatic correction procedures are applied only to words which are known to be misspelled; other corrections are performed manually during the normal editorial cycle. The constraints imposed on the selection of a spelling error detection technique by a complex data base, human factors, and high‐volume production are discussed.
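The abstract mentions hashing for both dictionary look-up and compression but does not say which scheme the article uses. As a rough illustration only, the sketch below uses a Bloom-filter-style bit table, one well-known way to combine the two: the words themselves are discarded and only hash-derived bits are kept. The names `HashedDictionary` and `flag_suspects`, the table size, and the choice of SHA-256 are all assumptions made for this example, not details from the article.

```python
import hashlib


class HashedDictionary:
    """Compressed word list: keeps hash-derived bits, not the words (illustrative)."""

    def __init__(self, words, num_bits=1 << 16, num_hashes=3):
        # num_bits and num_hashes are arbitrary illustrative values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)
        for word in words:
            for pos in self._positions(word):
                self.bits[pos // 8] |= 1 << (pos % 8)

    def _positions(self, word):
        # Derive several bit positions from one SHA-256 digest of the
        # lower-cased word (4 digest bytes per position).
        digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
        for i in range(self.num_hashes):
            chunk = digest[4 * i: 4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.num_bits

    def __contains__(self, word):
        # A word is accepted only if every one of its bits is set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(word)
        )


def flag_suspects(text, dictionary):
    """Return the whitespace-separated words the dictionary does not recognize."""
    return [w for w in text.split() if w not in dictionary]


if __name__ == "__main__":
    known = HashedDictionary(["spelling", "error", "detection"])
    print(flag_suspects("spelling eror detection", known))
```

A table like this never rejects a word that was inserted, but it can occasionally accept a misspelling whose bits happen to collide with those of valid words. That trade-off between table size and missed errors is inherent to this kind of compression; how the article itself handles it is not stated in the abstract.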
Date: 1980
DOI: https://doi.org/10.1002/asi.4630310106
Persistent link: https://EconPapers.repec.org/RePEc:bla:jamest:v:31:y:1980:i:1:p:51-57
Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1097-4571
More articles in Journal of the American Society for Information Science from Association for Information Science & Technology