The success of self-supervised contrastive learning hinges on identifying positive data pairs that, when pulled together in embedding space, encode information useful for downstream tasks. In time series, however, this is challenging because creating positive pairs via augmentations may break the original semantic meaning. We hypothesize that if information retrieved from one subsequence can successfully reconstruct another subsequence, then the two should form a positive pair. Building on this intuition, we introduce REtrieval-BAsed Reconstruction (REBAR) contrastive learning. First, we use a convolutional cross-attention architecture to calculate the REBAR error between two different time series. Then, through validation experiments, we show that the REBAR error is a predictor of mutual class membership, which justifies its use as a positive/negative labeler. Finally, once integrated into a contrastive learning framework, REBAR learns an embedding that achieves state-of-the-art performance on downstream tasks across various modalities.
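To make the retrieval-based reconstruction idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes an untrained stand-in for the learned convolutional cross-attention: sliding-window context vectors serve as queries and keys, and softmax dot-product attention retrieves values from the key subsequence to fill in masked points of the query subsequence. The names `windows`, `rebar_error_sketch`, and `pick_positive`, as well as the window width and the masking scheme, are hypothetical choices made for illustration.

```python
import numpy as np


def windows(x, width):
    """Return one edge-padded context window per time step (width must be odd)."""
    pad = width // 2
    xp = np.pad(x, pad, mode="edge")
    return np.stack([xp[i:i + width] for i in range(len(x))])


def rebar_error_sketch(query, key, mask, width=5):
    """Mean squared error of reconstructing the masked query points from key values.

    Assumption: raw sliding windows replace the learned convolutional
    projections described in the abstract, so nothing here is trained.
    """
    q_masked = np.where(mask, 0.0, query)        # hide the points to reconstruct
    q_feat = windows(q_masked, width)            # (T_query, width) query contexts
    k_feat = windows(key, width)                 # (T_key, width) key contexts
    scores = q_feat @ k_feat.T / np.sqrt(width)  # scaled dot-product similarity
    scores -= scores.max(axis=1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    recon = weights @ key                        # retrieve values from the key
    return float(np.mean((recon[mask] - query[mask]) ** 2))


def pick_positive(anchor, candidates, mask, width=5):
    """Label the candidate with the lowest reconstruction error as the positive."""
    errors = [rebar_error_sketch(anchor, c, mask, width) for c in candidates]
    return int(np.argmin(errors))


if __name__ == "__main__":
    t = np.linspace(0.0, 4.0 * np.pi, 64)
    anchor = np.sin(t)
    candidates = [np.sin(t + 0.3), np.sign(np.sin(3.0 * t))]
    mask = np.zeros(64, dtype=bool)
    mask[20:30] = True                           # contiguous masked span
    print("index of chosen positive:", pick_positive(anchor, candidates, mask))
```

In this sketch the remaining candidates would be treated as negatives for the anchor. With a trained cross-attention module in place of the raw windows, the same lowest-error rule is what would act as the positive/negative labeler inside a contrastive loss.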