Exploring Zero-Shot SLM Ensembles as an Alternative to LLMs for Sentiment Analysis
D. Cielen,
K. de Bock and
L. Flores
Additional contact information
K. de Bock: Audencia Business School
Post-Print from HAL
Abstract:
Sentiment analysis has become vital for understanding consumer attitudes, guiding product development, and informing strategic decisions. Although LLMs such as GPT-3.5 and GPT-4 deliver strong zero-shot performance, they can be cost-prohibitive and raise privacy concerns. In contrast, Small Language Models (SLMs) offer a lighter, more deployable solution, but their ability to match LLM accuracy, especially in zero-shot scenarios, remains underexplored. In this experimental study, we examine whether ensembles of zero-shot SLMs can serve as a viable alternative to proprietary LLMs for sentiment classification. We investigate five commonly used SLMs (Phi2 Mini, Mistral, Llama, Gemma, Aya) and compare them to GPT-based models (GPT-3.5, GPT-4, GPT-4 omni, GPT-4 omni mini) across seven English-language datasets. By automating prompt generation and filtering responses against a strict output format, we maintain a purely zero-shot approach. We form SLM ensembles via majority voting and evaluate their performance on accuracy, weighted precision, and weighted F1. We also measure inference time to assess cost and scalability trade-offs. Results show that SLM ensembles, as a form of decision fusion, consistently outperform single SLMs, significantly boosting metrics in zero-shot settings. Compared with GPT models, the ensemble achieves accuracy comparable to GPT-3.5 and even rivals GPT-4 on certain prompts, although GPT-4 retains a slight edge in both precision and F1 score. Moreover, local SLM ensembles incur higher latency yet offer potential advantages in data privacy and operational control. These findings demonstrate the feasibility of lightweight, zero-shot SLM ensembles for sentiment analysis, giving organizations an effective and more flexible alternative to relying exclusively on large proprietary models.
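The abstract describes two mechanical steps: discarding model responses that violate a strict output format, and fusing the surviving labels by majority vote. A minimal sketch of that pipeline is shown below; the label set, function names, and tie-breaking rule are illustrative assumptions, not the paper's actual implementation.

```python
from collections import Counter

# Assumed sentiment label set; the paper's exact output format may differ.
ALLOWED = {"positive", "negative", "neutral"}

def parse_response(raw):
    """Accept a model response only if, after trimming and lowercasing,
    it is exactly one allowed label; otherwise discard it (return None)."""
    label = raw.strip().lower()
    return label if label in ALLOWED else None

def majority_vote(responses):
    """Decision fusion by majority vote over the valid labels.
    Ties are broken by first-seen order for a deterministic result."""
    votes = [lbl for lbl in (parse_response(r) for r in responses) if lbl]
    if not votes:
        return None  # every model violated the output format
    counts = Counter(votes)
    top = max(counts.values())
    for label in votes:
        if counts[label] == top:
            return label

# Example: five hypothetical SLM outputs for one review; the free-text
# answer is filtered out, and the remaining labels are fused.
responses = ["Positive", "negative", " positive ", "I think it's good", "positive"]
print(majority_vote(responses))  # -> positive
```

Filtering before voting keeps the evaluation purely zero-shot: malformed answers are dropped rather than repaired or re-prompted with examples.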
Keywords: Zero-shot learning; Sentiment analysis; Small Language Models (SLMs); Large Language Models (LLMs); Ensemble methods; Decision fusion
Date: 2026-08
Published in Information Fusion, 2026, ⟨10.1016/j.inffus.2025.103666⟩
Persistent link: https://EconPapers.repec.org/RePEc:hal:journl:hal-05249134
DOI: 10.1016/j.inffus.2025.103666
Bibliographic data for series maintained by CCSD.