Intelligent OCR processing
Wei Sun, Lon‐Mu Liu, Weining Zhang and John Craig Comfort
Journal of the American Society for Information Science, 1992, vol. 43, issue 6, 422-431
Abstract:
Optical Character Recognition (OCR) has become a highly demanded information transfer technology in recent years. This demand has been driven by the growing need for information sharing and office automation, and by the increasing accessibility of large‐scale, fast, and powerful computer resources. A problem with current OCR technology is that texts produced by state‐of‐the‐art OCR software contain an unacceptably high frequency of errors. This prevents OCR technology from being used efficiently for vast‐volume information transfer or daily office operation applications. Correcting these errors in the conventional way requires a significant amount of costly human‐machine interaction. In this article, we identify and classify the types and distributions of optical recognition errors. We propose a novel post‐processing strategy, based on machine learning techniques, to correct errors resulting from unrecognized or misrecognized characters during the recognition process. By applying this strategy, the accuracy of recognition can be significantly improved, and the human interaction required can be dramatically reduced. Experimental results indicate that, in a typical environment, about 46% of total errors can be corrected automatically (i.e., without human intervention), with an accuracy of 91%. © 1992 John Wiley & Sons, Inc.
Date: 1992
Downloads: (external link)
https://doi.org/10.1002/(SICI)1097-4571(199207)43:63.0.CO;2-Z
Persistent link: https://EconPapers.repec.org/RePEc:bla:jamest:v:43:y:1992:i:6:p:422-431
Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1097-4571
More articles in Journal of the American Society for Information Science from Association for Information Science & Technology