Self-supervised learning (SSL) has recently achieved impressive performance on various time series tasks. Its most prominent advantage is reduced dependence on labeled data: with a pre-training and fine-tuning strategy, a model can reach high performance even with a small amount of labeled data. While many self-supervised surveys have been published for computer vision and natural language processing, a comprehensive survey for time series SSL is still missing. To fill this gap, this article reviews current state-of-the-art SSL methods for time series data. We first comprehensively review existing surveys related to SSL and time series, and then provide a new taxonomy of existing time series SSL methods. We group these methods into three categories: generative-based, contrastive-based, and adversarial-based, which are further divided into ten subcategories. To facilitate experiments on and validation of time series SSL methods, we also summarize datasets commonly used in time series forecasting, classification, anomaly detection, and clustering tasks. Finally, we present future directions of SSL for time series analysis.