Emotion recognition in conversations (ERC) is a crucial task for building human-like conversational agents. While substantial effort has been devoted to ERC for chit-chat dialogues, its task-oriented counterpart has received little attention. Directly applying chit-chat ERC models to task-oriented dialogues (ToDs) results in suboptimal performance, as these models overlook key features such as the correlation between emotions and task completion in ToDs. In this paper, we propose a framework that turns a chit-chat ERC model into a task-oriented one, addressing three critical aspects: data, features and objective. First, we devise two ways of augmenting rare emotions to improve ERC performance. Second, we use dialogue states as auxiliary features to incorporate key information from the user's goal. Lastly, we leverage a multi-aspect emotion definition in ToDs to devise a multi-task learning objective and a novel emotion-distance weighted loss function. Our framework yields significant improvements for a range of chit-chat ERC models on EmoWOZ, a large-scale dataset for user emotion in ToDs. We further investigate how well the best resulting model generalises to predicting user satisfaction in different ToD datasets. A comparison with supervised baselines shows a strong zero-shot capability, highlighting the potential of our framework in wider scenarios.