A Two-stage Approach for Word Searching in Handwritten Document Images
Ankur Goyal,
Pronita Mukherjee,
Dipra Mitra,
Shiv Kant,
Khalid Almalki and
Suliman Mohamed Fati
Data and Metadata, 2025, vol. 4, 54
Abstract:
Introduction: Despite the rise of electronic documents, handwritten paper documents remain important. Current technologies make document digitization, storage, compression, and transmission easy and affordable, but semi-automatic document image processing needs dedicated techniques to extract document information accurately. Information in digital libraries is typically retrieved through typed text searches. Objective: The words in a document generally contain varying numbers of characters. Searching for a word across the whole document therefore brings mismatched word images into the retrieved set and increases the time needed to complete the task. Method: With this in mind, words whose number of characters differs from that of the search word are discarded at the outset, as a preprocessing step. Result: To confirm which of the remaining words on the document page are probable matches for the search word, a voting-based approach is used. A modified HOG feature descriptor is extracted from each word image, five distance-matching metrics are then calculated, and these are fed to a voting scheme using a threshold value for each metric, calculated beforehand. Conclusion: Three types of voting are performed. In the first two, varying numbers of metrics vote on whether the word matches the search word; in the last, three distance metrics are used, and if more than one of them votes positive, the model marks the word as a match.
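The two stages described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the distance metrics, so Euclidean, Manhattan, and cosine distance are stand-ins; the feature vectors stand in for the modified HOG descriptors; the per-word character counts (`n_chars`) are assumed to have been estimated already; and all function names, dictionary keys, and threshold values are hypothetical.

```python
import math

# Three stand-in distance metrics (assumption: the abstract does not
# say which metrics the authors actually use).
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an all-zero vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)

METRICS = {
    "euclidean": euclidean,
    "manhattan": manhattan,
    "cosine": cosine_distance,
}

def filter_by_length(candidates, query_len):
    """Stage 1: drop words whose character count differs from the query's."""
    return [c for c in candidates if c["n_chars"] == query_len]

def count_votes(query_vec, cand_vec, thresholds):
    """Stage 2: a metric votes 'match' when its distance is within its threshold."""
    return sum(
        1
        for name, metric in METRICS.items()
        if metric(query_vec, cand_vec) <= thresholds[name]
    )

def search(query, candidates, thresholds, min_votes=2):
    """Return ids of candidates that pass the length filter and the vote.

    min_votes=2 mirrors the 'more than one of three metrics' rule
    mentioned in the abstract's conclusion.
    """
    survivors = filter_by_length(candidates, query["n_chars"])
    return [
        c["id"]
        for c in survivors
        if count_votes(query["vec"], c["vec"], thresholds) >= min_votes
    ]
```

The length filter runs first so that the more expensive descriptor comparisons are only made for words that could plausibly match. In practice the thresholds would have to be calibrated beforehand on labeled word images, as the abstract indicates.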
Date: 2025
Persistent link: https://EconPapers.repec.org/RePEc:dbk:datame:v:4:y:2025:i::p:54:id:1056294dm202554
DOI: 10.56294/dm202554