Semi-supervised action segmentation aims to perform frame-wise classification in long untrimmed videos, where only a fraction of the videos in the training set have labels. Recent studies have shown the potential of contrastive learning for unsupervised representation learning from unlabelled data. However, learning the representation of each frame by unsupervised contrastive learning for action segmentation remains an open and challenging problem. In this paper, we propose a novel Semantic-guided Multi-level Contrast scheme with a Neighbourhood-Consistency-Aware unit (SMC-NCA) to extract strong frame-wise representations for semi-supervised action segmentation. Specifically, for representation learning, SMC is first used to explore intra- and inter-information variations in a unified and contrastive way, based on a dynamic clustering process over the original input, encoded semantic and temporal features. Then, the NCA module, which enforces spatial consistency between neighbourhoods centered at different frames to alleviate over-segmentation, works alongside SMC for semi-supervised learning. Our SMC outperforms other state-of-the-art methods on three benchmarks, offering improvements of up to 17.8% and 12.6% in terms of edit distance and accuracy, respectively. Additionally, the NCA unit yields significantly better segmentation performance than the other methods when only 5% of the videos are labelled. We also demonstrate the effectiveness of the proposed method on our Parkinson's Disease Mouse Behaviour (PDMB) dataset. The code and datasets will be made publicly available.
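The two reported metrics, frame-wise accuracy and edit distance, can be illustrated with a minimal Python sketch. It assumes the segmental edit score commonly used in action segmentation (Levenshtein distance between the ordered segment labels, normalised to a 0-100 score); the abstract does not spell out the exact definition, and the function names `to_segments`, `frame_accuracy` and `edit_score` are hypothetical, not taken from the authors' code.

```python
def to_segments(frame_labels):
    """Collapse frame-wise labels into the ordered list of segment labels."""
    segments = []
    for label in frame_labels:
        if not segments or segments[-1] != label:
            segments.append(label)
    return segments


def frame_accuracy(pred, truth):
    """Percentage of frames whose predicted label equals the ground truth."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("pred and truth must be non-empty and equally long")
    correct = sum(p == t for p, t in zip(pred, truth))
    return 100.0 * correct / len(truth)


def edit_score(pred, truth):
    """Segmental edit score (assumed definition): 100 * (1 - normalised
    Levenshtein distance between the two segment-label sequences)."""
    if not pred or not truth:
        raise ValueError("pred and truth must be non-empty")
    p, t = to_segments(pred), to_segments(truth)
    # Row-by-row dynamic programme for the Levenshtein distance.
    prev = list(range(len(t) + 1))
    for i in range(1, len(p) + 1):
        curr = [i] + [0] * len(t)
        for j in range(1, len(t) + 1):
            cost = 0 if p[i - 1] == t[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # delete a predicted segment
                          curr[j - 1] + 1,     # insert a missing segment
                          prev[j - 1] + cost)  # match or substitute
        prev = curr
    return 100.0 * (1.0 - prev[len(t)] / max(len(p), len(t)))


# Toy example: one short spurious segment (over-segmentation) in the prediction.
truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]
pred = [0, 0, 1, 0, 1, 1, 2, 2, 2]
print(frame_accuracy(pred, truth), edit_score(pred, truth))
```

In this toy case only two of nine frames are wrong, so frame accuracy stays at about 77.8, while the edit score drops to 60 because the prediction contains five segments instead of three. This gap is why over-segmentation is tracked with the edit metric in addition to accuracy.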