We introduce a novel state-space model (SSM)-based framework for skeleton-based human action recognition, with an anatomically guided architecture that improves on state-of-the-art performance in both clinical diagnostics and general action recognition tasks. Our approach decomposes skeletal motion analysis into spatial, temporal, and spatio-temporal streams, using channel partitioning so that each stream captures distinct movement characteristics efficiently. By implementing a structured, multi-directional scanning strategy within SSMs, our model captures both local joint interactions and global motion patterns across multiple anatomical body parts. This anatomically aware decomposition enhances the ability to identify subtle motion patterns that are critical in medical diagnosis, such as gait anomalies associated with neurological conditions. On the public action recognition benchmarks NTU RGB+D, NTU RGB+D 120, and NW-UCLA, our model outperforms current state-of-the-art methods, achieving accuracy improvements of up to $3.2\%$ with lower computational complexity than previous leading transformer-based models. We also introduce a novel medical dataset for motion-based analysis of neurological disorders in patients, which we use to validate our method's potential in automated disease diagnosis.
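The channel partitioning and multi-directional scanning described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `partition_channels` and `scan_orders`, the even contiguous channel split, and the four specific scan directions are all assumptions chosen for illustration. The sketch only shows how feature channels might be divided among three streams and how a (frame, joint) grid might be flattened into different token orders before being fed to a sequence model such as an SSM.

```python
from typing import Dict, List, Tuple


def partition_channels(num_channels: int, num_streams: int = 3) -> List[range]:
    """Split channel indices into contiguous groups, one per stream.

    Assumption: an even, contiguous split; earlier groups absorb any remainder.
    """
    base, extra = divmod(num_channels, num_streams)
    groups, start = [], 0
    for i in range(num_streams):
        size = base + (1 if i < extra else 0)
        groups.append(range(start, start + size))
        start += size
    return groups


def scan_orders(num_frames: int, num_joints: int) -> Dict[str, List[Tuple[int, int]]]:
    """Return (frame, joint) visiting orders for four hypothetical scan directions."""
    # Frame-major: walk all joints within a frame, then move to the next frame.
    frame_major = [(t, v) for t in range(num_frames) for v in range(num_joints)]
    # Joint-major: follow one joint through all frames, then move to the next joint.
    joint_major = [(t, v) for v in range(num_joints) for t in range(num_frames)]
    return {
        "frame_major_fwd": frame_major,
        "frame_major_bwd": frame_major[::-1],
        "joint_major_fwd": joint_major,
        "joint_major_bwd": joint_major[::-1],
    }


if __name__ == "__main__":
    # Hypothetical assignment of channel groups to the three streams named in the abstract.
    streams = dict(zip(["spatial", "temporal", "spatio-temporal"], partition_channels(6)))
    for name, channels in streams.items():
        print(name, list(channels))
    for name, order in scan_orders(num_frames=2, num_joints=3).items():
        print(name, order)
```

In this sketch, a frame-major order keeps joints of the same pose adjacent in the sequence, while a joint-major order keeps the trajectory of a single joint adjacent; scanning in both directions lets a causal sequence model see context from either side. How the actual model groups joints by anatomical body part is not specified in the abstract and is not modeled here.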