EconPapers    
Economics at your fingertips  
 

Gradual OCR: An Effective OCR Approach Based on Gradual Detection of Texts

Youngki Park and Youhyun Shin ()
Additional contact information
Youngki Park: Department of Computer Education, Chuncheon National University of Education, Chuncheon 24328, Republic of Korea
Youhyun Shin: Department of Computer Science and Engineering, Incheon National University, Incheon 22012, Republic of Korea

Mathematics, 2023, vol. 11, issue 22, 1-20

Abstract: In this paper, we present a novel approach to optical character recognition that incorporates various supplementary techniques, including the gradual detection of texts and gradual filtering of inaccurately recognized texts. To minimize false negatives, we attempt to detect all text by incrementally lowering the relevant thresholds. To mitigate false positives, we implement a novel filtering method that dynamically adjusts based on the confidence levels of recognized texts and their corresponding detection thresholds. Additionally, we use straightforward yet effective strategies to enhance the optical character recognition accuracy and speed, such as upscaling, link refinement, perspective transformation, the merging of cropped images, and simple autoregression. Given our focus on Korean chart data, we compile a mix of real-world and artificial Korean chart datasets for experimentation. Our experimental results show that our approach outperforms Tesseract by approximately 7 to 15 times and EasyOCR by 3 to 5 times in accuracy, as measured using a Jaccard similarity-based error rate on our datasets.

Keywords: optical character recognition; gradual OCR; gradual text detection; gradual low-quality filtering (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2023
References: View complete reference list from CitEc
Citations:

Downloads: (external link)
https://www.mdpi.com/2227-7390/11/22/4585/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/22/4585/ (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:22:p:4585-:d:1276733

Access Statistics for this article

Mathematics is currently edited by Ms. Emma He

More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().

 
Page updated 2025-03-19
Handle: RePEc:gam:jmathe:v:11:y:2023:i:22:p:4585-:d:1276733