
Multilingual thematic modeling: A comparative study of classical and transformational approaches

Aizhan Nazyrova, Aikerim Nasrullayeva, Assel Mukanova, Aigerim Buribayeva and Banu Yergesh

International Journal of Innovative Research and Scientific Studies, 2025, vol. 8, issue 6, 2787-2799

Abstract: This study aims to conduct a comparative evaluation of classical and transformer-based sentiment analysis models applied to Kazakh-Russian bilingual texts, addressing the gap in resource-efficient NLP solutions for low-resource languages. Three models were implemented and evaluated: (1) Word2Vec with a two-layer neural network, (2) BERT (rubert-base-cased), and (3) DistilBERT (distilrubert-tiny). A balanced dataset of 226,000 bilingual comments was used. The models were compared using key performance indicators, including F1-score, accuracy, computational efficiency, inference speed, model size, and energy consumption. Results show that BERT achieved the highest accuracy (F1 = 0.90), but with significant computational and memory costs. DistilBERT provided nearly identical accuracy (F1 = 0.89) with substantially reduced resource requirements, while Word2Vec achieved lower accuracy (F1 = 0.81) but demonstrated superior speed and energy efficiency. Error analysis revealed consistent challenges across models in handling negation, sarcasm, idiomatic expressions, and code-mixed language. The findings confirm that lightweight transformer models, particularly DistilBERT, provide a favorable trade-off between accuracy and efficiency. Word2Vec remains a viable option for real-time and embedded applications, while BERT, although accurate, is less practical for resource-constrained environments. This study contributes to the advancement of Green AI principles by demonstrating how efficient sentiment analysis systems can be developed for low-resource languages. The proposed dataset and evaluation framework can serve as a benchmark for future Kazakh-Russian NLP research and practical applications, including mobile services, e-Government platforms, and education technologies.

Keywords: DistilBERT; Efficiency; Green AI; Semantic analysis; Sentiment analysis; Sustainability; NLP; Word2Vec; BERT
Date: 2025

Downloads:
https://ijirss.com/index.php/ijirss/article/view/10204/2378 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:aac:ijirss:v:8:y:2025:i:6:p:2787-2799:id:10204

International Journal of Innovative Research and Scientific Studies is currently edited by Natalie Jean

More articles in International Journal of Innovative Research and Scientific Studies from Innovative Research Publishing
Bibliographic data for series maintained by Natalie Jean.

 
Page updated 2025-09-25
Handle: RePEc:aac:ijirss:v:8:y:2025:i:6:p:2787-2799:id:10204