In this paper, we tackle the new task of video-based Activated Muscle Group Estimation (AMGE), which aims to identify active muscle regions during physical activity in the wild. To this end, we provide the MuscleMap dataset featuring more than 15K video clips with 135 different activities and 20 labeled muscle groups. This dataset opens the door to multiple video-based applications in sports and rehabilitation medicine under flexible environment constraints. The proposed MuscleMap dataset is constructed from YouTube videos, specifically targeting High-Intensity Interval Training (HIIT) physical exercise in the wild. To make the AMGE model applicable in real-life situations, it is crucial that the model generalizes well to numerous types of physical activities not present during training and involving new combinations of activated muscles. To assess this, our benchmark also covers an evaluation setting where the model is exposed to activity types excluded from the training set. Our experiments reveal that generalization remains a challenge for existing architectures adapted to the AMGE task. Therefore, we also propose a new approach, TransM3E, which employs multi-modality feature fusion between a video transformer model and a skeleton-based graph convolution model, with novel cross-modal knowledge distillation executed on multi-classification tokens. The proposed method surpasses all popular video classification models on both previously seen and new types of physical activities. The contributed dataset and code are publicly available at https://github.com/KPeng9510/MuscleMap.