Review and Comparative Analysis of Databases for Speech Emotion Recognition

Serrano, Salvatore; Serghini, Omar; Esposito, Giulia; Carbone, Silvia; Mento, Carmela; Floris, Alessandro; Porcu, Simone; Atzori, Luigi

Salvatore Serrano, Omar Serghini, Giulia Esposito, Silvia Carbone, Carmela Mento, Alessandro Floris, Simone Porcu and Luigi Atzori
Additional contact information
Salvatore Serrano: Laboratory of Digital Signal Processing, Department of Engineering, University of Messina, 98122 Messina, Italy
Omar Serghini: Laboratory of Digital Signal Processing, Department of Engineering, University of Messina, 98122 Messina, Italy
Giulia Esposito: Laboratory of Digital Signal Processing, Department of Engineering, University of Messina, 98122 Messina, Italy
Silvia Carbone: Dipartimento di Scienze Politiche e Giuridiche, University of Messina, 98122 Messina, Italy
Carmela Mento: Department of Biomedical and Dental Sciences and Morphofunctional Imaging, University of Messina, Via Consolare Valeria, 1, 98125 Messina, Italy
Alessandro Floris: Department of Electrical and Electronic Engineering, University of Cagliari, Via Marengo, 2, 09123 Cagliari, Italy
Simone Porcu: Department of Electrical and Electronic Engineering, University of Cagliari, Via Marengo, 2, 09123 Cagliari, Italy
Luigi Atzori: Department of Electrical and Electronic Engineering, University of Cagliari, Via Marengo, 2, 09123 Cagliari, Italy

Data, 2025, vol. 10, issue 10, 1-58

Abstract: Speech emotion recognition (SER) has become increasingly important in areas such as healthcare, customer service, robotics, and human–computer interaction. The progress of this field depends not only on advances in algorithms but also on the databases that provide the training material for SER systems. These resources set the boundaries for how well models can generalize across speakers, contexts, and cultures. In this paper, we present a narrative review and comparative analysis of emotional speech corpora released up to mid-2025, bringing together both psychological and technical perspectives. Rather than following a systematic review protocol, our approach focuses on providing a critical synthesis of more than fifty corpora covering acted, elicited, and natural speech. We examine how these databases were collected, how emotions were annotated, their demographic diversity, and their ecological validity, while also acknowledging the limits of available documentation. Beyond description, we identify recurring strengths and weaknesses, highlight emerging gaps, and discuss recent usage patterns to offer researchers both a practical guide for dataset selection and a critical perspective on how corpus design continues to shape the development of robust and generalizable SER systems.

Keywords: corpus analysis; emotion modeling; emotional speech databases; speech emotion recognition
JEL-codes: C8 C80 C81 C82 C83
Date: 2025

Downloads:
https://www.mdpi.com/2306-5729/10/10/164/pdf (application/pdf)
https://www.mdpi.com/2306-5729/10/10/164/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:10:y:2025:i:10:p:164-:d:1771204

Data is currently edited by Ms. Becky Zhang