Gesture recognition is an important research area in computer vision. Most gesture recognition work focuses on closed-set scenarios, which limits its ability to handle unseen or novel gestures. We address class-incremental gesture recognition, in which a model must accommodate new, previously unseen gestures over time. Specifically, we introduce a Prototype-Guided Pseudo Feature Replay (PGPFR) framework for data-free class-incremental gesture recognition. The framework comprises four components: Pseudo Feature Generation with Batch Prototypes (PFGBP), Variational Prototype Replay (VPR) for old classes, Truncated Cross-Entropy (TCE) for new classes, and Continual Classifier Re-Training (CCRT). To tackle catastrophic forgetting, PFGBP dynamically generates diverse pseudo features online, using the class prototypes of old classes together with the batch class prototypes of new classes. VPR enforces consistency between the classifier's weights and the prototypes of old classes, using class prototypes and covariance matrices to improve robustness and generalization. TCE mitigates the impact on the classifier of domain differences introduced by pseudo features. Finally, the CCRT training strategy is designed to prevent overfitting to new classes and to keep the features extracted from old classes stable. Extensive experiments on two widely used gesture recognition datasets, SHREC 2017 3D and EgoGesture 3D, show that our approach outperforms existing state-of-the-art methods by 11.8% and 12.8% in mean global accuracy, respectively. The code is available at https://github.com/sunao-101/PGPFR-3/.
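The abstract does not state the exact rule PFGBP uses, so the following is only a minimal sketch of one plausible reading of "pseudo features from old-class prototypes and batch prototypes of new classes". It assumes that each new-class feature is re-centred from its batch class prototype onto a randomly chosen stored old-class prototype. The function names `batch_prototypes` and `pseudo_features`, the re-centring rule, and the toy data are all hypothetical and are not taken from the released code.

```python
import random


def batch_prototypes(features, labels):
    """Mean feature vector of each class present in the current batch."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def pseudo_features(features, labels, old_prototypes, rng):
    """Assumed rule (not from the paper): keep each new-class feature's
    offset from its batch prototype and attach that offset to the stored
    prototype of a randomly chosen old class."""
    protos = batch_prototypes(features, labels)
    old_classes = sorted(old_prototypes)
    out_feats, out_labels = [], []
    for f, y in zip(features, labels):
        c = rng.choice(old_classes)
        shifted = [v - p + q for v, p, q in zip(f, protos[y], old_prototypes[c])]
        out_feats.append(shifted)
        out_labels.append(c)
    return out_feats, out_labels


# Toy data: two stored old-class prototypes and one batch of two new classes.
rng = random.Random(0)
old_prototypes = {0: [0.0, 0.0], 1: [5.0, 5.0]}
features = [[9.0, 1.0], [11.0, 3.0], [-4.0, 8.0], [-6.0, 6.0]]
labels = [2, 2, 3, 3]
feats, labs = pseudo_features(features, labels, old_prototypes, rng)
```

Under this assumption no old-class samples need to be stored, which is consistent with the data-free setting: the pseudo features borrow their within-class spread from the current batch and can be mixed with the real new-class features when training the classifier.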