Human activity recognition (HAR) is a crucial area of research that involves understanding human movements using computer and machine vision technology. Deep learning has emerged as a powerful tool for this task, with models such as Convolutional Neural Networks (CNNs) and Transformers being employed to capture various aspects of human motion. A key contribution of this work is demonstrating that feature fusion, which captures both spatial and temporal features, improves HAR accuracy; this has important implications for the development of more accurate and robust activity recognition systems. The study uses sensor data from the HuGaDB, PKU-MMD, LARa, and TUG datasets. Two models, PO-MS-GCN and a Transformer, were trained and evaluated, with PO-MS-GCN outperforming state-of-the-art models. The models achieved high accuracies and F1-scores on HuGaDB and TUG, and lower scores on LARa and PKU-MMD. Feature fusion improved results across datasets.