
Multi-Corpus Learning for Audio–Visual Emotions and Sentiment Recognition

Elena Ryumina, Maxim Markitantov and Alexey Karpov
Additional contact information
Elena Ryumina: St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), 199178 St. Petersburg, Russia
Maxim Markitantov: St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), 199178 St. Petersburg, Russia
Alexey Karpov: St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), 199178 St. Petersburg, Russia

Mathematics, 2023, vol. 11, issue 16, 1-22

Abstract: Recognition of emotions and sentiment (affective states) from human audio–visual information is widely used in healthcare, education, entertainment, and other fields, and has therefore become a highly active research area. The corpora available for developing affective state recognition approaches are numerous and heterogeneous, so single-corpus approaches trained on one corpus are often less effective on another. In this article, we propose an audio–visual approach for emotion and sentiment recognition based on multi-corpus learning. It extracts mid-level features at the segment level using two multi-corpus temporal models (a pre-trained transformer with GRU layers for the audio modality and a pre-trained 3D CNN with a BiLSTM-Former for the video modality) and predicts affective states using two single-corpus cross-modal gated self-attention fusion (CMGSAF) models. The proposed approach was tested on the RAMAS and CMU-MOSEI corpora. To date, our approach has outperformed state-of-the-art audio–visual emotion recognition approaches by 18.2% (78.1% vs. 59.9%) on the CMU-MOSEI corpus in terms of Weighted Accuracy and by 0.7% (82.8% vs. 82.1%) on the RAMAS corpus in terms of Unweighted Average Recall.
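
Below is a minimal PyTorch sketch of what a gated cross-modal attention fusion block over segment-level audio and video features might look like. The paper defines the actual CMGSAF architecture; the class name, layer sizes, gating formulation, and pooling used here are illustrative assumptions, not the authors' code.

    # Sketch of gated cross-modal attention fusion (assumptions, not the
    # paper's CMGSAF): each modality attends to the other, a sigmoid gate
    # controls how much cross-modal context is admitted, and self-attention
    # runs over the fused sequence.
    import torch
    import torch.nn as nn

    class GatedCrossModalFusion(nn.Module):
        def __init__(self, dim: int = 256, num_heads: int = 4):
            super().__init__()
            # Cross-attention: each stream queries the other modality.
            self.audio_to_video = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.video_to_audio = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            # Gates computed from [own features; cross-modal context].
            self.audio_gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
            self.video_gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
            # Self-attention over the concatenated, gated sequence.
            self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

        def forward(self, audio: torch.Tensor, video: torch.Tensor) -> torch.Tensor:
            # audio: (batch, T_a, dim); video: (batch, T_v, dim)
            a_ctx, _ = self.audio_to_video(audio, video, video)  # audio queries video
            v_ctx, _ = self.video_to_audio(video, audio, audio)  # video queries audio
            a = audio + self.audio_gate(torch.cat([audio, a_ctx], dim=-1)) * a_ctx
            v = video + self.video_gate(torch.cat([video, v_ctx], dim=-1)) * v_ctx
            fused = torch.cat([a, v], dim=1)                     # join along time axis
            out, _ = self.self_attn(fused, fused, fused)
            return out.mean(dim=1)                               # pooled fusion vector

    if __name__ == "__main__":
        fusion = GatedCrossModalFusion(dim=256)
        audio = torch.randn(2, 20, 256)    # 2 clips, 20 audio segments
        video = torch.randn(2, 16, 256)    # 2 clips, 16 video segments
        print(fusion(audio, video).shape)  # torch.Size([2, 256])

In such a design, the additive residual plus sigmoid gate lets the model fall back to unimodal features when the other modality is uninformative; a classification head over the pooled vector would produce the emotion or sentiment predictions.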

Keywords: audio–visual-based affective states recognition; emotion recognition; sentiment recognition; gated modality fusion; self-attention fusion; multi-corpus learning
JEL-codes: C
Date: 2023

Downloads: (external link)
https://www.mdpi.com/2227-7390/11/16/3519/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/16/3519/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:16:p:3519-:d:1217520

Handle: RePEc:gam:jmathe:v:11:y:2023:i:16:p:3519-:d:1217520