Emotion recognition is an important part of affective computing. Extracting emotional cues from human gait offers several advantages: natural interaction, nonintrusiveness, and remote detection. Recently, self-supervised learning has offered a practical way to address the scarcity of labeled data in gait-based emotion recognition. However, because gait diversity is limited and skeleton feature representations are incomplete, existing contrastive learning methods are usually inefficient at capturing gait emotions. In this paper, we propose a contrastive learning framework that uses selective strong augmentation (SSA) for self-supervised gait-based emotion representation, aiming to derive effective representations from limited labeled gait data. First, we propose an SSA method for the gait emotion recognition task, which comprises upper-body jitter and random spatiotemporal masking. SSA is designed to generate more diverse and targeted positive samples, prompting the model to learn more distinctive and robust feature representations. Then, we design a complementary feature fusion network (CFFN) that integrates cross-domain information to acquire topological structural features and global adaptive features. Finally, we apply a distributional divergence minimization loss to supervise representation learning for the normally and strongly augmented queries. We validate our approach on the Emotion-Gait (E-Gait) and Emilya datasets, where it outperforms state-of-the-art methods under different evaluation protocols.
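To make the two SSA components named in the abstract more concrete, the following is a minimal sketch of what upper-body jitter and random spatiotemporal masking could look like on a skeleton sequence. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the noise model, the masking strategy, or the joint layout, so the Gaussian noise, the zero-fill masking, the `UPPER_BODY_JOINTS` index set, and all function and parameter names here are hypothetical. Plain Python lists are used to keep the sketch dependency-free.

```python
import random

# Hypothetical upper-body joint indices for a 16-joint skeleton.
# The real index set depends on the dataset's joint layout.
UPPER_BODY_JOINTS = (2, 3, 4, 5, 6, 7, 8, 9)


def upper_body_jitter(seq, joints=UPPER_BODY_JOINTS, sigma=0.01, rng=random):
    """Add Gaussian noise (an assumed noise model) to the selected joints.

    seq is a list of frames; each frame is a list of [x, y, z] joints.
    Returns a new sequence and leaves the input untouched.
    """
    chosen = set(joints)
    return [
        [
            [c + rng.gauss(0.0, sigma) for c in joint] if j in chosen else list(joint)
            for j, joint in enumerate(frame)
        ]
        for frame in seq
    ]


def random_spatiotemporal_mask(seq, n_frames=4, n_joints=3, rng=random):
    """Zero out a random set of joints over a random set of frames.

    Zero-filling is an assumption; other masking schemes are possible.
    """
    num_frames, num_joints = len(seq), len(seq[0])
    frames = set(rng.sample(range(num_frames), min(n_frames, num_frames)))
    joints = set(rng.sample(range(num_joints), min(n_joints, num_joints)))
    return [
        [
            [0.0] * len(joint) if (t in frames and j in joints) else list(joint)
            for j, joint in enumerate(frame)
        ]
        for t, frame in enumerate(seq)
    ]


def selective_strong_augmentation(seq, rng=random):
    """Apply jitter, then masking, to build a strongly augmented positive."""
    jittered = upper_body_jitter(seq, rng=rng)
    return random_spatiotemporal_mask(jittered, rng=rng)
```

In a contrastive setup, the output of `selective_strong_augmentation` would serve as the strongly augmented view of a gait sequence, paired with a normally augmented view of the same sequence as its positive. Passing a seeded `random.Random` instance as `rng` makes the augmentation reproducible.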