Various contrastive learning approaches have been proposed in recent years and achieve significant empirical success. While effective and prevalent, contrastive learning has been less explored for time series data. A key component of contrastive learning is to select appropriate augmentations imposing some priors to construct feasible positive samples, such that an encoder can be trained to learn robust and discriminative representations. Unlike image and language domains where ``desired'' augmented samples can be generated with the rule of thumb guided by prefabricated human priors, the ad-hoc manual selection of time series augmentations is hindered by their diverse and human-unrecognizable temporal structures. How to find the desired augmentations of time series data that are meaningful for given contrastive learning tasks and datasets remains an open question. In this work, we address the problem by encouraging both high \textit{fidelity} and \textit{variety} based upon information theory. A theoretical analysis leads to the criteria for selecting feasible data augmentations. On top of that, we propose a new contrastive learning approach with information-aware augmentations, InfoTS, that adaptively selects optimal augmentations for time series representation learning. Experiments on various datasets show highly competitive performance with up to 12.0\% reduction in MSE on forecasting tasks and up to 3.7\% relative improvement in accuracy on classification tasks over the leading baselines.
翻译:近年来,多种对比学习方法被提出并在实证中取得了显著成功。尽管对比学习有效且应用广泛,但在时间序列数据上的研究仍相对不足。对比学习的关键在于选择合适的增强方法,通过引入先验知识构建可行的正样本,从而训练编码器学习鲁棒且具有判别性的表示。与图像和语言领域不同——在这些领域中,可依托预制人类先验的启发式规则生成“理想”增强样本——时间序列增强的手动选择因其多样性和人类难以识别的时序结构而受到制约。如何针对特定对比学习任务和数据集,找到有意义的理想时间序列增强方法仍是一个开放问题。本研究基于信息论,通过鼓励高保真度与多样性来解决该问题。理论分析得出了选择可行数据增强的准则。在此基础上,我们提出了一种新的对比学习方法——信息感知增强(InfoTS),它能够自适应地选择最优增强方法用于时间序列表示学习。在多个数据集上的实验表明,本方法在预测任务中均方误差最高降低12.0%,在分类任务中准确率相对领先基线提升3.7%,展现出极具竞争力的性能。