Video Action Counting (VAC) is crucial for analyzing sports, fitness, and everyday activities because it quantifies repetitive actions in videos. However, traditional VAC methods have overlooked the complexity of action repetitions, such as interruptions and variability in cycle duration. Our research addresses this shortfall by introducing a novel approach to VAC, called Irregular Video Action Counting (IVAC). IVAC prioritizes modeling irregular repetition patterns in videos, which we define through two primary aspects: Inter-cycle Consistency and Cycle-interval Inconsistency. Inter-cycle Consistency ensures homogeneity in the spatial-temporal representations of cycle segments, signifying that the action is uniform across cycle segments. Cycle-interval Inconsistency highlights the importance of distinguishing between cycle segments and intervals based on their inherent content differences. To encapsulate these principles, we propose a new methodology that includes consistency and inconsistency modules, supported by a pull-push loss (P2L) mechanism. The IVAC-P2L model applies a pull loss to promote coherence among cycle segment features and a push loss to separate the features of cycle segments from those of interval segments. Empirical evaluations on the RepCount dataset show that the IVAC-P2L model achieves state-of-the-art performance on the VAC task. Furthermore, the model generalizes well across varied video content, outperforming existing models on two additional datasets, UCFRep and Countix, without dataset-specific optimization. These results confirm the efficacy of our approach in addressing irregular repetitions in videos and pave the way for further advances in video analysis and understanding.
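The pull and push terms described above can be illustrated with a minimal sketch. This is not the IVAC-P2L formulation, which the abstract does not specify; it assumes per-segment feature vectors, a boolean mask marking cycle segments, cosine similarity as the affinity measure, and a hypothetical `margin` hyperparameter for the push term. The function name `pull_push_loss` is likewise illustrative.

```python
import numpy as np


def _normalize(x, eps=1e-8):
    # L2-normalize each row so that dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def pull_push_loss(features, is_cycle, margin=0.5):
    """Illustrative pull/push terms (assumed form, not the paper's).

    features: (T, D) array, one feature vector per video segment.
    is_cycle: (T,) boolean mask, True for cycle segments, False for intervals.
    margin:   hypothetical similarity threshold for the push term.
    Returns (pull, push) as Python floats.
    """
    features = _normalize(np.asarray(features, dtype=float))
    is_cycle = np.asarray(is_cycle, dtype=bool)
    cycle = features[is_cycle]
    interval = features[~is_cycle]

    # Pull: mean cosine distance over all distinct pairs of cycle segments,
    # so similar cycle features yield a small value.
    pull = 0.0
    n = len(cycle)
    if n > 1:
        sim = cycle @ cycle.T
        off_diagonal = sim[~np.eye(n, dtype=bool)]
        pull = float(np.mean(1.0 - off_diagonal))

    # Push: hinge penalty whenever a cycle segment and an interval segment
    # are more similar than the margin.
    push = 0.0
    if len(cycle) > 0 and len(interval) > 0:
        cross = cycle @ interval.T
        push = float(np.mean(np.maximum(0.0, cross - margin)))

    return pull, push


if __name__ == "__main__":
    # Two near-identical cycle segments and one dissimilar interval segment.
    feats = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0]]
    mask = [True, True, False]
    pull, push = pull_push_loss(feats, mask)
    print(f"pull={pull:.4f} push={push:.4f}")
```

In this sketch a total loss would be a weighted sum of the two terms; the weighting, like the margin, is a design choice the abstract leaves open.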