From human physiology to environmental evolution, important processes in nature often exhibit meaningful and strong periodic or quasi-periodic changes. Due to their inherent label scarcity, learning useful representations for periodic tasks with limited or no supervision is of great benefit. Yet, existing self-supervised learning (SSL) methods overlook the intrinsic periodicity in data, and fail to learn representations that capture periodic or frequency attributes. In this paper, we present SimPer, a simple contrastive SSL regime for learning periodic information in data. To exploit the periodic inductive bias, SimPer introduces customized augmentations, feature similarity measures, and a generalized contrastive loss for learning efficient and robust periodic representations. Extensive experiments on common real-world tasks in human behavior analysis, environmental sensing, and healthcare domains verify the superior performance of SimPer compared to state-of-the-art SSL methods, highlighting its intriguing properties including better data efficiency, robustness to spurious correlations, and generalization to distribution shifts. Code and data are available at: https://github.com/YyzHarry/SimPer.