InfoGCN++: Learning Representation by Predicting the Future for Online Human Skeleton-based Action Recognition

From arXiv. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Skeleton-based action recognition has advanced considerably in recent years, with models such as InfoGCN achieving high accuracy. These models share a key limitation, however: they must observe the complete action before classifying it, which restricts their use in real-time settings such as surveillance and robotic systems. To address this, we introduce InfoGCN++, an extension of InfoGCN designed for online skeleton-based action recognition. InfoGCN++ classifies actions in real time, regardless of how much of the sequence has been observed. It learns from both the current movement and the anticipated future movement, which yields a more complete representation of the entire sequence. We treat prediction as an extrapolation problem grounded in the observed motion. To this end, InfoGCN++ incorporates Neural Ordinary Differential Equations, which let it model the continuous evolution of hidden states. In evaluations on three skeleton-based action recognition benchmarks, InfoGCN++ consistently matches or exceeds existing techniques in online action recognition, indicating its potential for real-time applications. This work thus extends InfoGCN to the online setting. The code for InfoGCN++ is publicly available at https://github.com/stnoah1/infogcn2 for further exploration and validation.
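
As a rough illustration of the Neural ODE idea mentioned in the abstract (not the authors' implementation), the sketch below treats a hidden state as the solution of dh/dt = f(h) and extrapolates it beyond the last observed time with a fixed-step Runge-Kutta integrator. The function names, the two-dimensional state, and the weight matrix W are all hypothetical; in an actual model, f would be a learned network and the state would encode skeleton features.

```python
import math


def rk4_step(f, h, dt):
    """Advance the hidden state h by one classical Runge-Kutta step of size dt."""
    k1 = f(h)
    k2 = f([x + 0.5 * dt * k for x, k in zip(h, k1)])
    k3 = f([x + 0.5 * dt * k for x, k in zip(h, k2)])
    k4 = f([x + dt * k for x, k in zip(h, k3)])
    return [
        x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for x, a, b, c, d in zip(h, k1, k2, k3, k4)
    ]


def extrapolate(f, h_obs, horizon, n_steps=20):
    """Integrate dh/dt = f(h) from the last observed state over `horizon` time units.

    Returns the list of states visited, starting with the observed state.
    """
    h = list(h_obs)
    dt = horizon / n_steps
    trajectory = [h]
    for _ in range(n_steps):
        h = rk4_step(f, h, dt)
        trajectory.append(h)
    return trajectory


# Hypothetical dynamics: a tiny fixed-weight "network" f(h) = tanh(W h).
# A trained model would learn these weights; here they are made up.
W = [[-0.5, 1.0], [-1.0, -0.5]]


def dynamics(h):
    return [math.tanh(sum(w * x for w, x in zip(row, h))) for row in W]


# Extrapolate two time units past the (assumed) last observed hidden state.
future = extrapolate(dynamics, h_obs=[1.0, 0.0], horizon=2.0)
print(future[-1])
```

Because the state is defined in continuous time, the same dynamics function can be integrated over any horizon, which is what makes this formulation a natural fit for predicting the unobserved remainder of a sequence.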
