In speech emotion recognition (SER), using predefined features without considering their practical importance can produce high-dimensional datasets that contain redundant and irrelevant information. Learning from such high-dimensional data often lowers model accuracy while increasing computational complexity. Our work underlines the importance of carefully analyzing features in order to build efficient SER systems. We present a new supervised SER method based on an efficient feature engineering approach. We pay particular attention to the explainability of results, which we use to evaluate feature relevance and refine feature sets. This is performed iteratively through a feature evaluation loop that uses Shapley values to boost feature selection and improve overall framework performance. Our approach thus balances model performance and transparency. The proposed method surpasses human-level performance (HLP) and outperforms state-of-the-art machine learning methods in emotion recognition on the TESS dataset. The source code for this paper is publicly available at https://github.com/alaaNfissi/Iterative-Feature-Boosting-for-Explainable-Speech-Emotion-Recognition.
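The feature evaluation loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a hypothetical scoring function `toy_score` that stands in for the validation score of a model trained on a given feature subset, and it computes exact Shapley values over feature subsets rather than the per-prediction attributions a library such as SHAP would produce. The feature names, the scores, and the `threshold` parameter are all illustrative assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley value of each feature for a set function `value`.

    `value` maps a frozenset of feature names to a score. The cost is
    exponential in the number of features, so this is only a toy.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k in the Shapley formula.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def iterative_selection(features, value, threshold=0.01):
    """Repeatedly drop the lowest-ranked feature until all exceed `threshold`."""
    selected = list(features)
    while len(selected) > 1:
        phi = shapley_values(selected, value)
        worst = min(selected, key=lambda f: phi[f])
        if phi[worst] > threshold:
            break  # every remaining feature contributes enough
        selected.remove(worst)
    return selected


def toy_score(subset):
    """Hypothetical stand-in for a model's validation score on `subset`."""
    gains = {"mfcc": 0.5, "pitch": 0.3, "energy": 0.1}  # assumed values
    score = sum(gains.get(f, 0.0) for f in subset)
    if "noise" in subset:
        score -= 0.02  # an irrelevant feature slightly hurts the score
    return score


FEATURES = ["mfcc", "pitch", "energy", "noise"]  # hypothetical feature names
selected = iterative_selection(FEATURES, toy_score)
print(selected)
```

In a realistic setting, `value` would retrain or re-evaluate a classifier on each subset, which is why practical pipelines rely on approximate Shapley estimators instead of the exhaustive enumeration used here.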