Unveiling the Effectiveness of NLP-Based DL Methods for Urdu Text Analysis
Noman Tahir,
Michal Nykl,
Ondřej Pražák and
Karel Ježek
Additional contact information
Noman Tahir: DCSE, University of West Bohemia
Michal Nykl: NTIS, University of West Bohemia
Ondřej Pražák: NTIS, University of West Bohemia
Karel Ježek: DCSE, University of West Bohemia
A chapter in Information Systems and Technological Advances for Sustainable Development, 2024, pp 102-113 from Springer
Abstract:
Analyzing text data has become a significant challenge as its volume keeps growing. Many text analysis methods exist, handling different data types and processing styles, but they mainly target English. As a result, low-resource languages remain difficult to process because suitable intelligent methods are unavailable. Urdu, as one such low-resource language, requires effective methods based on machine learning or deep learning (DL). Our study identifies a rarely used pure Urdu text dataset, an effective combination of embeddings, and the best combination of hyperparameters for DL methods trained on that dataset. Based on the evaluation results, it also determines the best methods with respect to embeddings, hyperparameters, and overall performance. Combining pre-trained BERT embeddings with the fine-tuned BiLSTM and BERT was the best method for coping with Urdu as a low-resource language. Based on these findings, our study recommends pre-trained embedding models and hyperparameter settings for Urdu text classification.
Keywords: NLP for Urdu; BERT; BiLSTM; Urdu text analysis; Deep learning for Urdu
Date: 2024
Persistent link: https://EconPapers.repec.org/RePEc:spr:lnichp:978-3-031-75329-9_12
Ordering information: This item can be ordered from
http://www.springer.com/9783031753299
DOI: 10.1007/978-3-031-75329-9_12
More chapters in Lecture Notes in Information Systems and Organization from Springer