Emotion Recognition from Videos Using Multimodal Large Language Models

Vaiani, Lorenzo; Cagliero, Luca; Garza, Paolo

Emotion Recognition from Videos Using Multimodal Large Language Models

Lorenzo Vaiani, Luca Cagliero and Paolo Garza ()
Additional contact information
Lorenzo Vaiani: Dipartimento di Automatica e Informatica, Politecnico di Torino, Corso Duca degli Abruzzi, 24, 10129 Torino, Italy
Luca Cagliero: Dipartimento di Automatica e Informatica, Politecnico di Torino, Corso Duca degli Abruzzi, 24, 10129 Torino, Italy
Paolo Garza: Dipartimento di Automatica e Informatica, Politecnico di Torino, Corso Duca degli Abruzzi, 24, 10129 Torino, Italy

Future Internet, 2024, vol. 16, issue 7, 1-17

Abstract: The diffusion of Multimodal Large Language Models (MLLMs) has opened new research directions in the context of video content understanding and classification. Emotion recognition from videos aims to automatically detect human emotions such as anxiety and fear. It requires deeply elaborating multiple data modalities, including acoustic and visual streams. State-of-the-art approaches leverage transformer-based architectures to combine multimodal sources. However, the impressive performance of MLLMs in content retrieval and generation offers new opportunities to extend the capabilities of existing emotion recognizers. This paper explores the performance of MLLMs in the emotion recognition task in a zero-shot learning setting. Furthermore, it presents a state-of-the-art architecture extension based on MLLM content reformulation. The performance achieved on the Hume-Reaction benchmark shows that MLLMs are still unable to outperform the state-of-the-art average performance but, notably, are more effective than traditional transformers in recognizing emotions with an intensity that deviates from the average of the samples.

Keywords: video–language large language models; emotion recognition; emotional reaction intensity estimation; multimodal learning (search for similar items in EconPapers)
JEL-codes: O3 (search for similar items in EconPapers)
Date: 2024
References: View complete reference list from CitEc
Citations:

Downloads: (external link)
https://www.mdpi.com/1999-5903/16/7/247/pdf (application/pdf)
https://www.mdpi.com/1999-5903/16/7/247/ (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:16:y:2024:i:7:p:247-:d:1434458

Access Statistics for this article

Future Internet is currently edited by Ms. Grace You

More articles in Future Internet from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().