Emotional well-being significantly influences mental health and overall quality of life. Although therapy chatbots are becoming increasingly prevalent, their ability to comprehend and respond empathetically to users' emotions remains limited. This paper addresses this limitation by proposing an approach that equips therapy chatbots with auditory perception, enabling them to understand users' feelings and respond with human-like empathy. The proposed method applies speech emotion recognition (SER) using convolutional neural network (CNN) models and the ShEMO dataset to detect and classify negative emotions, including anger, fear, and sadness. The SER model achieves a validation accuracy of 88%, demonstrating its effectiveness in recognizing emotional states from speech signals. Furthermore, a recommender system is developed that uses the SER model's output to generate personalized recommendations for managing negative emotions. Because no dataset was available for this task, a new bilingual dataset was created. The recommender model, which combines global vectors for word representation (GloVe) with a long short-term memory (LSTM) model, achieves an accuracy of 98%. To provide a more immersive and empathetic user experience, a text-to-speech model, GlowTTS, is integrated so that the therapy chatbot can speak the generated recommendations to users in both English and Persian. The proposed approach shows promise for enhancing therapy chatbots with the ability to recognize and respond to users' emotions, ultimately improving the delivery of mental health support for both English- and Persian-speaking users.