To integrate action recognition methods into autonomous robotic systems, it is crucial to consider adverse situations involving target occlusions. Such a scenario, despite its practical relevance, is rarely addressed in existing self-supervised skeleton-based action recognition methods. To empower robots with the capacity to address occlusion, we propose a simple and effective method. We first pre-train on occluded skeleton sequences, then apply k-means clustering (KMeans) to the sequence embeddings to group semantically similar samples. Next, we employ K-nearest-neighbors (KNN) to fill in missing skeleton data based on the closest sample neighbors. Imputing incomplete skeleton sequences to create relatively complete input sequences provides significant benefits to existing skeleton-based self-supervised models. In addition, building on the state-of-the-art Partial Spatio-Temporal Learning (PSTL), we introduce an Occluded Partial Spatio-Temporal Learning (OPSTL) framework. This extension uses Adaptive Spatial Masking (ASM) to make better use of high-quality, intact skeletons. The effectiveness of our imputation methods is verified on challenging occluded versions of NTU RGB+D 60 and NTU RGB+D 120. The source code will be made publicly available at https://github.com/cyfml/OPSTL.
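The cluster-then-impute step can be illustrated with a minimal sketch. It is not the released implementation, and it rests on several assumptions: the embeddings come from an encoder already pre-trained on occluded sequences, each skeleton sequence is flattened to a list of coordinates with `None` marking occluded values, neighbors are searched only within the sample's own cluster, and a missing value is filled with the mean of the neighbors' observed values at the same position. The names `kmeans`, `knn_impute`, and `sq_dist` are hypothetical helpers written for this illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means. Centers start at the first k points
    (a deterministic choice made only to keep this sketch reproducible)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its closest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def knn_impute(sequences, embeddings, labels, n_neighbors=2):
    """Fill None entries of each sequence from its nearest same-cluster
    neighbors (assumed rule: mean of the neighbors' observed values)."""
    imputed = []
    for i, seq in enumerate(sequences):
        # Candidate donors: other samples in the same cluster.
        peers = [j for j in range(len(sequences))
                 if j != i and labels[j] == labels[i]]
        peers.sort(key=lambda j: sq_dist(embeddings[i], embeddings[j]))
        nearest = peers[:n_neighbors]
        filled = list(seq)
        for t, value in enumerate(seq):
            if value is None:
                donors = [sequences[j][t] for j in nearest
                          if sequences[j][t] is not None]
                if donors:  # leave the gap if no neighbor observed it
                    filled[t] = sum(donors) / len(donors)
        imputed.append(filled)
    return imputed


# Toy data: six samples, two obvious groups in embedding space.
embeddings = [[0.0, 0.0], [5.0, 5.0], [0.1, 0.0],
              [5.1, 5.0], [0.0, 0.1], [5.0, 5.1]]
sequences = [
    [1.0, None, 3.0, 4.0],      # occluded value at position 1
    [10.0, 20.0, 30.0, 40.0],
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, None, 40.0],   # occluded value at position 2
    [1.0, 4.0, 3.0, 4.0],
    [10.0, 20.0, 50.0, 40.0],
]

labels = kmeans(embeddings, k=2)
imputed = knn_impute(sequences, embeddings, labels, n_neighbors=2)
print(labels)
print(imputed[0], imputed[3])
```

Restricting the neighbor search to one cluster keeps donors semantically close to the occluded sample, which is the motivation the abstract gives for clustering before imputation. The imputed sequences would then replace the occluded ones as input to the downstream self-supervised model.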