Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition

Micro-Action Recognition (MAR) has gained increasing attention due to its crucial role as a form of non-verbal communication in social interactions, with promising applications in human communication and emotion analysis. However, current approaches often overlook the inherent ambiguity of micro-actions, which arises from the wide range of categories and the subtle visual differences between them. This oversight limits recognition accuracy. In this paper, we propose a novel Prototypical Calibrating Ambiguous Network (\textbf{PCAN}) to expose and mitigate the ambiguity in MAR. \textbf{Firstly}, we employ a hierarchical action-tree to identify ambiguous samples and divide them into distinct sets of false negatives and false positives, considering both body-level and action-level categories. \textbf{Secondly}, we implement an ambiguous contrastive refinement module that calibrates these ambiguous samples by regulating the distance between each sample and its corresponding prototype. This calibration pulls false negative ($\mathbb{FN}$) samples closer to their respective prototypes and pushes false positive ($\mathbb{FP}$) samples away from their affiliated prototypes. In addition, we propose a new prototypical diversity amplification loss that strengthens the model's capacity by amplifying the differences between prototypes. \textbf{Finally}, we propose a prototype-guided rectification that rectifies predictions by incorporating the representational power of the prototypes. Extensive experiments conducted on the benchmark dataset demonstrate the superior performance of our method compared to existing approaches. The code is available at https://github.com/kunli-cs/PCAN.

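Translation: As an important form of non-verbal communication, micro-action recognition has drawn growing attention in the study of social interaction and shows broad application prospects in interpersonal communication and emotion analysis. However, existing methods often overlook the inherent ambiguity of micro-actions, which stems from the wide range of categories and the subtle visual differences between them, and this limits improvements in recognition accuracy. This paper proposes a novel Prototypical Calibrating Ambiguous Network (PCAN) to reveal and mitigate the ambiguity problem in micro-action recognition. First, we use a hierarchical action tree to identify ambiguous samples and, drawing on both body-level and action-level category information, divide them into two distinct sets of ambiguous samples: false negatives and false positives. Second, we design an ambiguous contrastive refinement module that calibrates these samples by adjusting the distance between each ambiguous sample and its corresponding prototype: the process pulls false negative samples closer to the prototypes they belong to and pushes false positive samples away from the prototypes they are associated with. In addition, we propose a prototypical diversity amplification loss that strengthens the model's representational capacity by enlarging the differences between prototypes. Finally, we propose a prototype-guided prediction rectification mechanism that corrects predictions by incorporating the representational capacity of the prototypes. Extensive experiments on the benchmark dataset show that the method outperforms existing approaches. The code is open source: https://github.com/kunli-cs/PCAN.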
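
The abstract describes the pipeline only at a high level, so the following is a minimal illustrative sketch of the four ideas, not the authors' implementation (see the linked repository for that). Everything concrete here is an assumption made for illustration: the toy `ACTION_TO_BODY` tree, the rule that a confusion within one body group counts as action-level, the squared Euclidean distance with a hinge `margin` for the pull/push terms, mean pairwise cosine similarity as the diversity penalty, and the blending weight `alpha` in the rectification step.

```python
import math

# Hypothetical two-level action tree: action id -> body-level group (assumption).
ACTION_TO_BODY = {0: "head", 1: "head", 2: "hand", 3: "hand"}


def split_ambiguous(labels, preds, k):
    """Return (FN, FP) sample indices for action class k.

    FN: true class is k but the model predicted something else.
    FP: the model predicted k but the true class is different.
    """
    pairs = list(zip(labels, preds))
    fn = [i for i, (y, p) in enumerate(pairs) if y == k and p != k]
    fp = [i for i, (y, p) in enumerate(pairs) if y != k and p == k]
    return fn, fp


def confusion_level(true_action, pred_action):
    """Assumed rule: same body group -> action-level confusion, else body-level."""
    same = ACTION_TO_BODY[true_action] == ACTION_TO_BODY[pred_action]
    return "action-level" if same else "body-level"


def sq_dist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def calibration_loss(features, prototype, fn, fp, margin=1.0):
    """Pull FN samples toward the prototype; push FP samples beyond `margin`."""
    pull = sum(sq_dist(features[i], prototype) for i in fn)
    push = sum(max(0.0, margin - sq_dist(features[i], prototype)) for i in fp)
    return (pull + push) / max(1, len(fn) + len(fp))


def diversity_loss(prototypes):
    """Mean pairwise cosine similarity; minimizing it spreads prototypes apart.

    Assumes every prototype has a non-zero norm.
    """
    total, count = 0.0, 0
    for i in range(len(prototypes)):
        for j in range(i + 1, len(prototypes)):
            u, v = prototypes[i], prototypes[j]
            norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
            total += sum(a * b for a, b in zip(u, v)) / norm
            count += 1
    return total / max(1, count)


def rectify(probs, feature, prototypes, alpha=0.5):
    """Blend classifier probabilities with a softmax over negative prototype distances."""
    scores = [math.exp(-sq_dist(feature, p)) for p in prototypes]
    z = sum(scores)
    return [(1 - alpha) * q + alpha * s / z for q, s in zip(probs, scores)]


# Toy usage: four samples, class 0 as the class of interest.
labels, preds = [0, 0, 1, 2], [0, 1, 0, 0]
features = [[0.0, 0.0], [1.0, 0.0], [0.5, 0.0], [2.0, 0.0]]
fn_idx, fp_idx = split_ambiguous(labels, preds, k=0)
loss = calibration_loss(features, [0.0, 0.0], fn_idx, fp_idx)
```

In this toy setup, sample 1 is a false negative for class 0 and is penalized by its distance to the class-0 prototype, while samples 2 and 3 are false positives and are penalized only if they lie within the margin. A real model would compute these terms on learned features, with prototypes updated during training.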