Research on emotion recognition in conversation (ERC) has largely treated sentence feature encoding and context modeling as separate stages, leaving generative paradigms built on a unified design underexplored. In this study, we propose InstructERC, an approach that reformulates ERC from a discriminative task into a generative one based on large language models (LLMs). InstructERC makes three contributions: (1) it introduces a simple yet effective retrieval template module that helps the model explicitly integrate multi-granularity dialogue supervision information; (2) it adds two emotion alignment tasks, speaker identification and emotion prediction, to implicitly model dialogue role relationships and future emotional tendencies in conversations; and (3) it pioneers the unification of emotion labels across benchmarks through the feeling wheel to better fit real application scenarios, and InstructERC still performs impressively on the resulting unified dataset. Our LLM-based plug-in framework significantly outperforms all previous models and achieves state-of-the-art (SOTA) results on three commonly used ERC datasets. Extensive analysis of parameter-efficient and data-scaling experiments provides empirical guidance for applying it in practical scenarios.
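To illustrate what recasting ERC as a generation task can look like in practice, the following is a minimal sketch and not the InstructERC template itself. The prompt wording, the `Utterance` type, the function names `build_erc_prompt` and `parse_label`, the context window size, and the label list are all assumptions introduced here for illustration; the abstract does not specify any of them.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Utterance:
    """One dialogue turn (hypothetical container, not from the paper)."""
    speaker: str
    text: str


def build_erc_prompt(dialogue: Sequence[Utterance], target: int,
                     labels: Sequence[str], window: int = 3) -> str:
    """Format one turn as a text prompt whose expected answer is a label word.

    Assumption: the prompt layout below is illustrative only.
    """
    if not 0 <= target < len(dialogue):
        raise IndexError("target is out of range")
    # Keep at most `window` preceding turns plus the target turn as context.
    start = max(0, target - window)
    context = [f"{u.speaker}: {u.text}" for u in dialogue[start:target + 1]]
    focus = dialogue[target]
    return "\n".join([
        "Task: identify the emotion of the last utterance in the conversation.",
        "Conversation:",
        *context,
        f'Question: what emotion does {focus.speaker} express in "{focus.text}"?',
        "Options: " + ", ".join(labels),
        "Answer:",
    ])


def parse_label(generated: str, labels: Sequence[str]) -> str:
    """Map free-form generated text back to a known label, else 'unknown'."""
    lowered = generated.lower()
    # Choose the label whose first occurrence in the output comes earliest.
    hits = [(lowered.find(label.lower()), label)
            for label in labels if label.lower() in lowered]
    return min(hits)[1] if hits else "unknown"


if __name__ == "__main__":
    turns = [Utterance("A", "I lost my keys."), Utterance("B", "Oh no, again?")]
    print(build_erc_prompt(turns, 1, ["joy", "anger", "sadness"]))
```

In a generative setup of this kind, a language model would complete the text after `Answer:` and `parse_label` would map that completion back onto the label set. Simple substring matching is used only to keep the sketch short; it would confuse labels that contain one another.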