With the rapid advancement of ubiquitous computing technology, human activity analysis based on time series data from a diverse range of sensors enables the delivery of more intelligent services. Although discovering new activities matters in real-world scenarios, existing human activity recognition studies generally rely on a predefined set of known activities and often overlook the detection of new patterns (novelties) that were not observed during training. Novelty detection in human activities is made more challenging by (1) the diversity of patterns within the same known activity, (2) patterns shared between known and new activities, and (3) differences in sensor properties across activity datasets. We introduce CLAN, a two-tower model that leverages Contrastive Learning with diverse data Augmentation for New activity detection in sensor-based environments. CLAN simultaneously and explicitly uses multiple types of strongly shifted data as negative samples in contrastive learning, which lets it learn representations that are invariant to the pattern variations within the same activity. To better distinguish known and new activities that share common features, CLAN incorporates both the time and frequency domains, enabling it to learn multi-faceted discriminative representations. Additionally, we design a mechanism that automatically selects data augmentation methods suited to each dataset's properties, generating appropriate positive and negative pairs for contrastive learning. Comprehensive experiments on real-world datasets show that CLAN achieves a 9.24% improvement in AUROC compared to the best-performing baseline model.
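The core idea of using strongly shifted samples as negatives in both the time and frequency domains can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`weak_augment`, `strong_shifts`, `two_tower_loss`), the specific augmentations, the temperature, and the use of raw flattened signals in place of learned encoders are all assumptions made for illustration. The loss is a generic InfoNCE-style objective.

```python
import numpy as np


def frequency_view(x):
    """Magnitude spectrum per channel; x has shape (time, channels)."""
    return np.abs(np.fft.rfft(x, axis=0))


def weak_augment(x, rng, sigma=0.01):
    """Hypothetical mild augmentation (small jitter), used as the positive view."""
    return x + rng.normal(0.0, sigma, size=x.shape)


def strong_shifts(x, rng):
    """Hypothetical strong augmentations, each treated as a negative sample."""
    ramp = np.linspace(-1.0, 1.0, x.shape[0])[:, None]
    return [
        rng.permutation(x, axis=0),                         # shuffle time steps
        x + rng.normal(0.0, 5.0 * x.std(), size=x.shape),   # heavy noise
        x * ramp,                                           # amplitude ramp with sign change
    ]


def contrastive_loss(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss with one positive and several negatives."""
    def unit(v):
        return v / (np.linalg.norm(v) + 1e-12)

    a = unit(anchor)
    pos = np.dot(a, unit(positive)) / temperature
    negs = np.array([np.dot(a, unit(n)) for n in negatives]) / temperature
    logits = np.concatenate(([pos], negs))
    m = logits.max()  # subtract the max for a numerically stable log-softmax
    return float(-(pos - m - np.log(np.exp(logits - m).sum())))


def two_tower_loss(x, rng):
    """Sum of a time-domain and a frequency-domain contrastive loss.

    Assumption: each 'tower' is the identity on flattened features; a real
    model would pass each view through its own trained encoder.
    """
    pos = weak_augment(x, rng)
    negs = strong_shifts(x, rng)
    time_loss = contrastive_loss(
        x.ravel(), pos.ravel(), [n.ravel() for n in negs]
    )
    freq_loss = contrastive_loss(
        frequency_view(x).ravel(),
        frequency_view(pos).ravel(),
        [frequency_view(n).ravel() for n in negs],
    )
    return time_loss + freq_loss


# Toy three-channel sensor window, 128 time steps
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 128)
window = np.stack([np.sin(t), np.cos(t), np.sin(2.0 * t)], axis=1)
loss = two_tower_loss(window, rng)
print(f"combined loss: {loss:.4f}")
```

The choice of negatives matters per domain. A time reversal or sign flip, for instance, leaves the magnitude spectrum unchanged and so would not act as a negative in the frequency tower, which is one reason a dataset-specific selection of augmentations is useful.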