Masked reconstruction is a fundamental pretext task for self-supervised learning: by reconstructing masked segments of large amounts of unlabeled data, the model strengthens its feature extraction capabilities. In human activity recognition, this pretext task has employed masking strategies centered on the time dimension. However, such strategies do not fully exploit the inherent characteristics of wearable sensor data and overlook the coupling of information across channels, which limits their potential as a pretext task. To address these limitations, we propose a novel masking strategy called Channel Masking. It masks the sensor data along the channel dimension, compelling the encoder to extract channel-related features while performing the masked reconstruction task. Moreover, Channel Masking can be seamlessly integrated with masking strategies along the time dimension, so that the self-supervised model performs masked reconstruction in both the time and channel dimensions. We name the integrated strategies Time-Channel Masking and Span-Channel Masking. Finally, we extend the reconstruction loss function to incorporate the reconstruction loss in both the time and channel dimensions. We evaluate the proposed masking strategies on three public datasets, and experimental results show that they outperform prior strategies in both self-supervised and semi-supervised scenarios.
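To make the idea of masking along both dimensions concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a sensor window shaped `(time steps, channels)`, zero-filling of masked entries, mean squared error as the reconstruction loss, and a hypothetical weight `alpha` that balances the time and channel terms. The abstract specifies none of these, and the function names and mask ratios are illustrative only. Span masking (contiguous time segments) is not shown.

```python
import numpy as np


def time_mask(shape, ratio, rng):
    """Boolean (T, C) mask that hides a random subset of whole time steps."""
    T, C = shape
    n = max(1, int(round(ratio * T)))
    idx = rng.choice(T, size=n, replace=False)
    mask = np.zeros((T, C), dtype=bool)
    mask[idx, :] = True
    return mask


def channel_mask(shape, ratio, rng):
    """Boolean (T, C) mask that hides a random subset of whole channels."""
    T, C = shape
    n = max(1, int(round(ratio * C)))
    idx = rng.choice(C, size=n, replace=False)
    mask = np.zeros((T, C), dtype=bool)
    mask[:, idx] = True
    return mask


def masked_mse(x, x_hat, mask):
    """Mean squared error restricted to the masked positions."""
    diff = (x - x_hat)[mask]
    return float(np.mean(diff ** 2)) if diff.size else 0.0


def time_channel_loss(x, x_hat, t_mask, c_mask, alpha=0.5):
    """Weighted sum of the time and channel reconstruction losses.

    `alpha` is an assumed hyperparameter, not taken from the source.
    """
    return alpha * masked_mse(x, x_hat, t_mask) + (1.0 - alpha) * masked_mse(x, x_hat, c_mask)


rng = np.random.default_rng(0)
x = rng.standard_normal((120, 6))         # one window: 120 samples, 6 sensor channels
t_mask = time_mask(x.shape, 0.15, rng)    # assumed 15% of time steps
c_mask = channel_mask(x.shape, 0.2, rng)  # assumed 20% of channels
combined = t_mask | c_mask                # masked in time, in channel, or both
x_masked = np.where(combined, 0.0, x)     # encoder input with masked entries zeroed
```

In a training loop, `x_masked` would be fed to the encoder-decoder and its output would take the place of `x_hat` in `time_channel_loss`. Computing the two terms over separate masks is one plausible reading of "reconstruction loss in both the time and channel dimensions"; the paper itself may define the loss differently.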