A Deep Learning Based Offline Optical Character Recognition Model for Printed Ottoman Turkish

Al-Khaffaf, Ahmed

Ahmed Al-Khaffaf

Technium, 2023, vol. 18, issue 1, 47-64

Abstract: Developing efficient optical character recognition (OCR) systems for printed Ottoman text remains difficult because existing OCR models built for Arabic have limitations when applied to it. These models have been shown to perform poorly on Ottoman text, and even when they are trained specifically on Ottoman text, their results are insufficient. This study proposes a deep learning model for analyzing printed Ottoman Turkish documents in the Matbu font. The model is an end-to-end trainable architecture that integrates convolutional neural networks (CNNs) with bidirectional long short-term memory (BiLSTM) units. Experimental results show that the proposed model achieved overall accuracy, sensitivity, and precision of 99.6%, 87.1%, and 93.3%, respectively, on the test dataset.
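The abstract reports accuracy, sensitivity, and precision but does not say how they were computed or averaged. As a rough illustration only, the sketch below assumes one-vs-rest counts per character class followed by a macro average; the function names and the toy labels are hypothetical and are not taken from the paper.

```python
def per_class_metrics(y_true, y_pred):
    """Return {label: (accuracy, sensitivity, precision)} from one-vs-rest counts.

    Assumption: each character class is scored against all others.
    The paper may have used a different definition.
    """
    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    out = {}
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        tn = n - tp - fp - fn
        accuracy = (tp + tn) / n
        sensitivity = tp / (tp + fn) if tp + fn else 0.0  # also called recall
        precision = tp / (tp + fp) if tp + fp else 0.0
        out[c] = (accuracy, sensitivity, precision)
    return out


def macro_average(metrics):
    """Unweighted mean of each metric over all classes (an assumed averaging)."""
    k = len(metrics)
    return tuple(sum(m[i] for m in metrics.values()) / k for i in range(3))


# Toy example with three hypothetical character labels.
y_true = ["a", "a", "b", "c"]
y_pred = ["a", "b", "b", "c"]
metrics = per_class_metrics(y_true, y_pred)
macro = macro_average(metrics)
```

Under this definition, true negatives dominate when there are many classes, which is one way accuracy can sit well above sensitivity and precision, as in the reported figures.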

Date: 2023

Downloads: (external link)
https://techniumscience.com/index.php/technium/article/view/10252/3986 (application/pdf)
https://techniumscience.com/index.php/technium/article/view/10252 (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:tec:techni:v:18:y:2023:i:1:p:47-64

DOI: 10.47577/technium.v18i.10252

Technium is currently edited by Scurtu Ionut Cristian

Bibliographic data for series maintained by Ana Maria Golita.