This paper presents CAMEO, a curated collection of multilingual emotional speech datasets designed to facilitate research in emotion recognition and other speech-related tasks. The main objectives were to ensure easy access to the data, to allow reproducibility of the results, and to provide a standardized benchmark for evaluating speech emotion recognition (SER) systems across different emotional states and languages. The paper describes the dataset selection criteria and the curation and normalization process, and reports performance results for several models. The collection, together with its metadata and a leaderboard, is publicly available on the Hugging Face platform.