Self-supervised learning (SSL) has recently achieved impressive performance on various time series tasks. Its most prominent advantage is that it reduces the dependence on labeled data: with a pre-training and fine-tuning strategy, a model can achieve high performance even when only a small amount of labeled data is available. Although many self-supervised surveys have been published for computer vision and natural language processing, a comprehensive survey of SSL for time series is still missing. To fill this gap, this article reviews current state-of-the-art SSL methods for time series data. We first review existing surveys related to SSL and time series, and then propose a new taxonomy of existing time series SSL methods organized around three perspectives: generative-based, contrastive-based, and adversarial-based. These methods are further divided into ten subcategories, each reviewed and discussed in detail in terms of its key intuitions, main frameworks, advantages, and disadvantages. To facilitate experiments on and validation of time series SSL methods, we also summarize datasets commonly used in time series forecasting, classification, anomaly detection, and clustering tasks. Finally, we present future directions of SSL for time series analysis.