A Multi-Level Arabic Text Diacritization System
Ali Mijlad and
Yacine El Younoussi
Additional contact information
Ali Mijlad: ENSATe, Abdelmalek Essaadi University, SIGL Laboratory
Yacine El Younoussi: ENSATe, Abdelmalek Essaadi University, SIGL Laboratory
A chapter in Technological Innovations for Sustainable Development, 2025, pp 17-27 from Springer
Abstract:
This paper presents a multi-level Arabic diacritization system that restores diacritics in undiacritized Arabic text. Such systems are crucial for Arabic-related NLP tasks and aid learners and individuals with learning difficulties, such as dyslexia or visual impairments. Our system uses a two-level approach, a word-based level and a letter-based level, each employing an encoder-decoder model with a local predictive Luong attention mechanism. The combined model achieves a 22.47% diacritic error rate, significantly outperforming the single-level models, and remains competitive with previous studies despite using a smaller dataset.
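For readers unfamiliar with the metric, the following is a minimal sketch of how a diacritic error rate (DER) is commonly computed: the share of Arabic letters whose predicted diacritics differ from the reference. The abstract does not state the chapter's exact definition (for example, whether case endings or undiacritized letters are counted), so the counting rule, the function names `split_diacritics` and `diacritic_error_rate`, and the choice of the mark range U+064B to U+0652 are all assumptions made for illustration.

```python
# Arabic combining marks U+064B..U+0652: tanween, short vowels, shadda, sukun.
# Assumption: these eight marks are the diacritics being restored.
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)}


def split_diacritics(text):
    """Split text into (base_char, marks) pairs; marks attach to the preceding base."""
    pairs = []
    for ch in text:
        if ch in DIACRITICS:
            if pairs:  # ignore a stray mark with no preceding base character
                base, marks = pairs[-1]
                pairs[-1] = (base, marks | {ch})
        else:
            pairs.append((ch, frozenset()))
    return pairs


def diacritic_error_rate(reference, hypothesis):
    """Fraction of Arabic letters whose set of diacritics differs from the reference.

    Hypothetical helper: counts every Arabic letter (U+0621..U+064A),
    including letters that carry no diacritic in the reference.
    """
    ref = split_diacritics(reference)
    hyp = split_diacritics(hypothesis)
    if [b for b, _ in ref] != [b for b, _ in hyp]:
        raise ValueError("reference and hypothesis must share the same base characters")
    letters = errors = 0
    for (base, ref_marks), (_, hyp_marks) in zip(ref, hyp):
        if "\u0621" <= base <= "\u064A":
            letters += 1
            if ref_marks != hyp_marks:
                errors += 1
    return errors / letters if letters else 0.0


# Reference: kaf+fatha, ta+fatha, ba+fatha. Hypothesis puts a kasra on the ta.
gold = "\u0643\u064E\u062A\u064E\u0628\u064E"
pred = "\u0643\u064E\u062A\u0650\u0628\u064E"
der = diacritic_error_rate(gold, pred)  # one wrong letter out of three
```

Marks are compared as sets so that a letter carrying shadda plus a vowel counts as correct regardless of the order in which the two marks are encoded.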
Keywords: Diacritization; Arabic; GRU; Encoder-Decoder Model; Luong Attention
Date: 2025
Persistent link: https://EconPapers.repec.org/RePEc:spr:lnichp:978-3-032-06725-8_2
Ordering information: This item can be ordered from
http://www.springer.com/9783032067258
DOI: 10.1007/978-3-032-06725-8_2
More chapters in Lecture Notes in Information Systems and Organization from Springer