Emotion recognition from multi-modal physiological and behavioral signals plays a pivotal role in affective computing, yet most existing models remain limited to predicting a single emotion in controlled laboratory settings. Real-world human emotional experiences, by contrast, are often characterized by the simultaneous presence of multiple affective states, spurring recent interest in mixed emotion recognition framed as an emotion distribution learning problem. Current approaches, however, often neglect the valence consistency and structured correlations inherent among coexisting emotions. To address this limitation, we propose a Memory-guided Prototypical Co-occurrence Learning (MPCL) framework that explicitly models emotion co-occurrence patterns. Specifically, we first fuse multi-modal signals via a multi-scale associative memory mechanism. To capture cross-modal semantic relationships, we construct emotion-specific prototype memory banks, yielding rich physiological and behavioral representations, and employ prototype relation distillation to enforce cross-modal alignment in the latent prototype space. Furthermore, inspired by human cognitive memory systems, we introduce a memory retrieval strategy to extract semantic-level co-occurrence associations across emotion categories. Through this bottom-up hierarchical abstraction process, our model learns affectively informative representations for accurate emotion distribution prediction. Comprehensive experiments on two public datasets demonstrate that MPCL consistently outperforms state-of-the-art methods in mixed emotion recognition, both quantitatively and qualitatively.
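To make the prototype relation distillation idea concrete, the following is a minimal sketch of one plausible instantiation: each modality keeps one prototype per emotion, and the pairwise prototype-similarity ("relation") structures of the two modalities are aligned via a KL divergence. The function name, the cosine-similarity relation matrix, the temperature `tau`, and the KL objective are all illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable row-wise softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def prototype_relation_distillation(protos_a, protos_b, tau=0.1):
    """Align two modalities' emotion-prototype spaces by matching their
    pairwise prototype-similarity (relation) distributions.

    protos_a, protos_b: (K, d) arrays, one prototype per emotion category.
    Returns a scalar KL-based alignment loss (hypothetical formulation).
    """
    # L2-normalize so the Gram matrices hold cosine similarities.
    a = protos_a / np.linalg.norm(protos_a, axis=1, keepdims=True)
    b = protos_b / np.linalg.norm(protos_b, axis=1, keepdims=True)
    # Temperature-scaled relation distributions over prototype pairs.
    rel_a = softmax(a @ a.T / tau)
    rel_b = softmax(b @ b.T / tau)
    # Row-averaged KL(rel_b || rel_a): zero iff the relation structures match.
    return float(np.mean(np.sum(rel_b * (np.log(rel_b) - np.log(rel_a)), axis=1)))
```

Under this sketch, minimizing the loss pulls the two modalities toward consistent inter-prototype geometry without forcing the prototypes themselves to coincide, which is one common way relation-level distillation differs from feature-level alignment.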