Effective feature representations are critical to the performance of text generation models built on deep neural networks. Current approaches, however, have several drawbacks: they fail to capture the deep semantics of language, and they are sensitive to minor input variations, so small changes in the input can produce significant changes in the generated text. In this paper, we address these challenges with a mixture of experts, implemented as multiple encoders, that offers distinct perspectives on the emotional state of the user's utterance while also improving performance. We propose ASEM, an end-to-end model architecture for open-domain chatbots that performs emotion analysis on top of sentiment analysis, enabling the generation of empathetic responses that are fluent and relevant. In contrast to traditional attention mechanisms, the proposed model employs a specialized attention strategy that focuses on the sentiment and emotion nuances of the user's utterance, yielding context-rich representations tailored to the underlying emotional tone and sentiment of the text. Our approach outperforms existing methods for generating empathetic embeddings and produces empathetic, diverse responses: compared with existing models, it improves emotion detection accuracy by 6.2% and lexical diversity by 1.4%.
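To make the mixture-of-experts idea concrete, the following is a minimal sketch of how the outputs of several encoders could be combined with softmax gate weights. The abstract does not specify how ASEM combines its encoders, so the function names (`softmax`, `mix_experts`), the gating scheme, and the toy vectors are all illustrative assumptions, not the authors' method.

```python
import math


def softmax(scores):
    """Turn raw gate scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_experts(expert_outputs, gate_scores):
    """Combine one vector per expert into a single vector.

    expert_outputs: list of equal-length vectors, one per (hypothetical) encoder.
    gate_scores: one raw score per expert; a higher score means more influence.
    Returns the weighted sum of the vectors and the gate weights used.
    """
    if len(expert_outputs) != len(gate_scores):
        raise ValueError("need exactly one gate score per expert")
    weights = softmax(gate_scores)
    mixed = [0.0] * len(expert_outputs[0])
    for weight, vector in zip(weights, expert_outputs):
        for i, value in enumerate(vector):
            mixed[i] += weight * value
    return mixed, weights


# Toy example: two hypothetical encoders (for instance, one oriented toward
# sentiment and one toward emotion) each emit a 2-dimensional representation.
sentiment_view = [1.0, 0.0]
emotion_view = [0.0, 1.0]
mixed, weights = mix_experts([sentiment_view, emotion_view], [0.0, 0.0])
print(mixed, weights)
```

With equal gate scores the two views contribute equally; in a trained model the gate scores would normally be produced by a learned function of the input rather than fixed by hand.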