In real-world dialogue systems, the ability to understand a user's emotions and respond in a human-like manner is of great significance. Emotion Recognition in Conversation (ERC) is one of the key ways to achieve this goal and has attracted growing attention. Modeling conversational context is both central to ERC and one of its major challenges. Most existing approaches cannot capture global and local contextual information efficiently at the same time, and their network structures are complex to design. We therefore propose a straightforward Dual-stream Recurrence-Attention Network (DualRAN) based on a Recurrent Neural Network (RNN) and a Multi-head ATtention network (MAT). The proposed model eschews the complex network structures of current methods and focuses on combining recurrence-based methods with attention-based methods. DualRAN is a dual-stream structure consisting mainly of a local-aware module and a global-aware module, which model a conversation from distinct perspectives. To implement the local-aware module, we extend the structure of the RNN, thereby enhancing the expressive capability of the network. In addition, we develop two single-stream variants of DualRAN, namely SingleRANv1 and SingleRANv2. We conduct extensive experiments on four widely used benchmark datasets, and the results show that the proposed model outperforms all baselines. Ablation studies further demonstrate the effectiveness of each component.
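The dual-stream idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the plain tanh recurrent cell stands in for the extended RNN, the attention uses identity projections instead of learned ones, and fusing the two streams by concatenation is an assumption. All names (`local_stream`, `global_stream`, `dual_stream`), the dimensions, and the random weights are hypothetical and chosen only for illustration.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def local_stream(utts, W_in, W_rec):
    """Local-aware stream (assumed form): a plain tanh recurrent pass,
    so each state depends on the current utterance and the previous state."""
    h = [0.0] * len(W_rec)
    out = []
    for x in utts:
        a, b = matvec(W_in, x), matvec(W_rec, h)
        h = [math.tanh(ai + bi) for ai, bi in zip(a, b)]
        out.append(h)
    return out

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def global_stream(utts, num_heads):
    """Global-aware stream (assumed form): multi-head scaled dot-product
    self-attention over all utterances, with identity projections."""
    head_dim = len(utts[0]) // num_heads
    out = [[] for _ in utts]
    for h in range(num_heads):
        # Each head sees its own slice of the feature vector.
        heads = [u[h * head_dim:(h + 1) * head_dim] for u in utts]
        for i, q in enumerate(heads):
            scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(head_dim)
                      for k in heads]
            weights = softmax(scores)
            ctx = [sum(w * v[j] for w, v in zip(weights, heads))
                   for j in range(head_dim)]
            out[i].extend(ctx)
    return out

def dual_stream(utts, W_in, W_rec, num_heads):
    """Run both streams and concatenate per utterance (fusion is an assumption)."""
    local = local_stream(utts, W_in, W_rec)
    glob = global_stream(utts, num_heads)
    return [l + g for l, g in zip(local, glob)]

rng = random.Random(0)
T, d, hidden, heads = 5, 8, 6, 2  # hypothetical sizes
conversation = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(T)]
W_in = rand_matrix(hidden, d, rng)
W_rec = rand_matrix(hidden, hidden, rng)
fused = dual_stream(conversation, W_in, W_rec, heads)
print(len(fused), len(fused[0]))  # 5 14
```

Each of the 5 utterances ends up with a 14-dimensional representation: 6 values from the recurrent stream, which only sees preceding context, and 8 from the attention stream, which sees the whole conversation. A classifier over emotion labels would sit on top of this fused vector; it is omitted here.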