The lack of an available emotional pathological speech database is one of the key obstacles to studying how patients with dysarthria express emotion. This paper constructs the first Chinese multimodal emotional pathological speech database containing multi-perspective information. It includes 29 controls and 39 patients with different degrees of motor dysarthria, each expressing happy, sad, angry, and neutral emotions. All emotional speech was labeled for intelligibility, emotion type, and discrete dimensional emotions using a WeChat mini-program developed for this purpose. Subjective analysis validates the database in terms of emotion discrimination accuracy, speech intelligibility, valence-arousal spatial distribution, and the correlation between SCL-90 and disease severity. Automatic emotion recognition was tested on speech and glottal data, with average accuracies of 78% for controls and 60% for patients on audio, and 51% for controls and 38% for patients on glottal data, indicating that the disease influences emotional expression.