Diffusion Model-based Contrastive Learning for Human Activity Recognition

WiFi Channel State Information (CSI)-based activity recognition has attracted numerous studies because WiFi is widely available and CSI sensing preserves privacy. In practical applications, however, general CSI-based recognition models may generalize poorly: individuals with different behavior habits cause different fluctuations in CSI data, and it is difficult to gather enough training data to cover all kinds of motion habits. To tackle this problem, we design a diffusion model-based Contrastive Learning framework for human Activity Recognition (CLAR) using WiFi CSI. Building on the contrastive learning framework, CLAR introduces two components to enhance CSI-based activity recognition. First, to generate diverse augmented data and complement limited training data, we propose a diffusion model-based, time series-specific augmentation model. Typical diffusion models apply conditions directly to the generative process, which can distort the CSI data; our tailored model instead decomposes the conditions into high-frequency and low-frequency components and applies them to the generative process with different weights. This alleviates data distortion and yields high-quality augmented data. Second, to efficiently capture differences in sample importance, we present an adaptive weight algorithm. Unlike typical contrastive learning methods, which treat all training samples equally, this algorithm adaptively adjusts the weights of positive sample pairs to learn better data representations. Experiments suggest that CLAR achieves significant gains over state-of-the-art methods.
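
The two components can be illustrated with a minimal NumPy sketch. It is not CLAR's implementation: the abstract does not specify how the conditions are decomposed or how the pair weights are computed, so the FFT-based split, the cutoff bin, the weights `w_low` and `w_high`, and the hardness-based weighting rule below are all assumptions chosen only to make the idea concrete.

```python
import numpy as np


def split_frequency(cond, cutoff_bin):
    """Split a 1-D series into low- and high-frequency parts via the real FFT.

    Assumption: an FFT cutoff is used here only for illustration.
    """
    spectrum = np.fft.rfft(cond)
    low_spectrum = spectrum.copy()
    low_spectrum[cutoff_bin:] = 0.0           # keep only bins below the cutoff
    low = np.fft.irfft(low_spectrum, n=len(cond))
    high = cond - low                         # residual carries the remaining bins
    return low, high


def weighted_condition(cond, cutoff_bin=4, w_low=1.0, w_high=0.3):
    """Recombine the two components with different (hypothetical) weights."""
    low, high = split_frequency(cond, cutoff_bin)
    return w_low * low + w_high * high


def weighted_info_nce(z_a, z_b, temperature=0.5):
    """InfoNCE-style loss in which each positive pair has its own weight.

    z_a[i] and z_b[i] are embeddings of two views of sample i.
    Assumption: pairs with lower cosine similarity receive larger weight.
    """
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    cos = z_a @ z_b.T                         # pairwise cosine similarities
    logits = cos / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    pair_loss = -np.diag(log_prob)            # per-pair contrastive loss
    hardness = 1.0 - np.diag(cos)             # hypothetical importance score
    weights = np.exp(hardness) / np.exp(hardness).sum()
    return float((weights * pair_loss).sum()), weights
```

A uniform-weight contrastive loss corresponds to replacing `weights` with `1 / N` for every pair; the adaptive variant differs only in how those per-pair weights are chosen.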
