Skeleton-based action recognition relies on extracting spatial-temporal topological information. Hypergraphs can encode prior dependencies between joints that go beyond the natural (physical) connections of the skeleton. However, existing hypergraph methods focus only on constructing the spatial topology and ignore dependencies between time points. This paper proposes a dynamic spatial-temporal hypergraph convolutional network (DST-HCN) to capture spatial-temporal information for skeleton-based action recognition. DST-HCN introduces a time-point hypergraph (TPH) to learn relationships between time points. By combining multiple static spatial hypergraphs with the dynamic TPH, our network learns more complete spatial-temporal features. In addition, we use a high-order information fusion (HIF) module to fuse spatial and temporal information synchronously. Extensive experiments on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets show that our model achieves state-of-the-art performance, especially compared with other hypergraph methods.
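The abstract does not spell out the convolution operator, so the following is only a rough illustration of what a hypergraph convolution over skeleton joints can look like. It sketches the generic normalized hypergraph convolution, X' = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Θ, which is a common formulation in the hypergraph literature and not necessarily the one used in DST-HCN. The function name `hypergraph_conv`, the toy five-joint skeleton, and the grouping of joints into hyperedges are all assumptions made for this example.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def hypergraph_conv(x, incidence, theta, edge_weights=None):
    """Generic normalized hypergraph convolution (illustrative, not DST-HCN's exact operator).

    x:            V x C_in vertex features (one row per joint)
    incidence:    V x E matrix, incidence[v][e] = 1 if joint v belongs to hyperedge e
    theta:        C_in x C_out projection (learnable in a real network)
    edge_weights: optional list of E hyperedge weights, defaults to all ones
    """
    n_vertices = len(incidence)
    n_edges = len(incidence[0])
    w = edge_weights if edge_weights is not None else [1.0] * n_edges

    # Vertex degree: weighted number of hyperedges each joint belongs to.
    dv = [sum(w[e] * incidence[v][e] for e in range(n_edges)) for v in range(n_vertices)]
    # Hyperedge degree: number of joints in each hyperedge.
    de = [sum(incidence[v][e] for v in range(n_vertices)) for e in range(n_edges)]

    # Dv^{-1/2} H, with isolated joints mapped to zero rows.
    h_norm = [
        [incidence[v][e] / math.sqrt(dv[v]) if dv[v] > 0 else 0.0 for e in range(n_edges)]
        for v in range(n_vertices)
    ]
    # W De^{-1} H^T Dv^{-1/2}, an E x V matrix.
    edge_to_vertex = [
        [w[e] / de[e] * h_norm[v][e] if de[e] > 0 else 0.0 for v in range(n_vertices)]
        for e in range(n_edges)
    ]

    # V x V propagation matrix, then aggregate features and project.
    propagation = matmul(h_norm, edge_to_vertex)
    return matmul(matmul(propagation, x), theta)


# Toy skeleton (hypothetical): 0 = head, 1 = left hand, 2 = right hand, 3 = left foot, 4 = right foot.
# Hyperedge 0 groups both hands, hyperedge 1 groups both feet, hyperedge 2 groups head and hands.
incidence = [
    [0, 0, 1],
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
    [0, 1, 0],
]
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0]]
theta = [[1.0, 0.0], [0.0, 1.0]]  # identity projection, for readability

out = hypergraph_conv(features, incidence, theta)
for joint, row in enumerate(out):
    print(joint, [round(value, 3) for value in row])
```

Unlike an ordinary graph edge, a hyperedge can connect any number of joints, which is what lets the two hands share information here even though they are not physically adjacent. The same operator could in principle be applied with frames as vertices, but how DST-HCN actually builds its time-point hypergraph is not specified in the abstract.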