 

Build a Trained Data of Tesseract OCR engine for Tifinagh Script Recognition

Ali Benaissa, Abdelkhalak Bahri, Ahmad El Allaoui and My Abdelouahab Salahddine

Data and Metadata, 2023, vol. 2, 185

Abstract: This article introduces a methodology for constructing a trained dataset to facilitate Tifinagh script recognition using the Tesseract OCR engine. The Tifinagh script, widely used in North Africa, poses a challenge because Tesseract has no built-in support for recognizing it. To overcome this limitation, our approach proceeds through image generation, box generation, manual editing, charset extraction, and dataset compilation. By leveraging Python scripting, specialized software tools, and Tesseract's training utilities, we systematically create a comprehensive dataset for Tifinagh script recognition. The dataset enables the training and evaluation of machine learning models, leading to accurate character recognition. Experimental results demonstrate high accuracy, precision, recall, and F1 score, affirming the effectiveness of the dataset and its potential for practical applications. The results highlight the robustness of the OCR system, which achieves an accuracy rate of 99.97%. The discussion underscores its performance in Tifinagh character recognition, which exceeds previously reported results in the field. This methodology contributes to enhancing OCR technology capabilities and encourages further research in Tifinagh script recognition, unlocking the wealth of information contained in Tifinagh documents.
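The charset extraction step named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' script: it assumes the standard Tesseract box-file layout (one symbol per line, followed by left, bottom, right, top coordinates and a page number), and the function names and sample data are hypothetical. In an actual Tesseract training run this step is normally handled by the `unicharset_extractor` utility.

```python
from collections import Counter


def parse_box_line(line):
    """Parse one box-file line: '<symbol> <left> <bottom> <right> <top> <page>'."""
    # Split from the right so the symbol keeps any characters it contains.
    parts = line.rsplit(" ", 5)
    if len(parts) != 6:
        raise ValueError(f"malformed box line: {line!r}")
    symbol = parts[0]
    left, bottom, right, top, page = (int(p) for p in parts[1:])
    return symbol, (left, bottom, right, top), page


def extract_charset(box_text):
    """Count how often each symbol occurs in the text of a box file."""
    counts = Counter()
    for line in box_text.splitlines():
        line = line.strip()
        if line:
            symbol, _, _ = parse_box_line(line)
            counts[symbol] += 1
    return counts


# Hypothetical sample: three boxes covering two distinct Tifinagh letters.
SAMPLE_BOX = "ⴰ 10 20 40 60 0\nⴱ 45 20 75 60 0\nⴰ 80 20 110 60 0\n"

charset = extract_charset(SAMPLE_BOX)
print(sorted(charset))
```

Counting occurrences rather than only collecting the distinct symbols makes it easy to spot letters that are underrepresented in the generated images before training starts.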

Date: 2023

Persistent link: https://EconPapers.repec.org/RePEc:dbk:datame:v:2:y:2023:i::p:185:id:1056294dm2023185

DOI: 10.56294/dm2023185

More articles in Data and Metadata from AG Editor
Bibliographic data for series maintained by Javier Gonzalez-Argote.

 
Handle: RePEc:dbk:datame:v:2:y:2023:i::p:185:id:1056294dm2023185