Despite recent advancements in Self-Supervised Learning (SSL) for time series analysis, a noticeable gap persists between anticipated and actual performance. While these methods have demonstrated strong generalization with minimal labels across various domains, their ability to distinguish between classes given only a limited number of annotated records remains notably limited. We hypothesize that this bottleneck stems from the prevalent use of Contrastive Learning, the training objective shared by previous state-of-the-art (SOTA) methods. By mandating distinctiveness between representations of negative pairs drawn from separate records, this objective compels the model to encode record-specific patterns while neglecting changes that occur across a single record. To overcome this limitation, we introduce Distilled Embedding for Almost-Periodic Time Series (DEAPS), a non-contrastive method tailored to quasiperiodic time series such as electrocardiogram (ECG) data. By avoiding negative pairs, we not only mitigate the model's blindness to temporal changes but also enable the integration of a "Gradual Loss (Lgra)" function, which guides the model to capture dynamic patterns evolving throughout the record. The results are promising: DEAPS achieves a notable improvement of +10% over existing SOTA methods when only a few annotated records are available to fit a Machine Learning (ML) model on the learned representations.
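To make the training objective concrete, the following is a minimal, purely illustrative sketch of a non-contrastive loss of the kind the abstract describes. The alignment term follows the standard distillation-style recipe (pull a student view toward a stop-gradient teacher view of the same segment, with no negative pairs); the gradual term is a hypothetical stand-in for Lgra, here encoding the assumption that segments farther apart in time should not become *more* similar to the anchor. The function name `deaps_style_loss`, the weight `w`, and the exact form of both terms are assumptions, not the paper's actual formulation.

```python
import math

def cos(a, b):
    """Cosine similarity between two embedding vectors (plain lists)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den

def deaps_style_loss(student, teacher, segment_embs, w=0.5):
    """Illustrative non-contrastive loss: alignment + hypothetical gradual term.

    student, teacher: embeddings of two augmented views of the same segment.
    segment_embs: embeddings of consecutive segments from one record,
                  ordered by time; segment_embs[0] is the anchor.
    """
    # Non-contrastive alignment: pull the student's view toward the
    # teacher's view of the same segment -- no negative pairs involved.
    l_align = 1.0 - cos(student, teacher)

    # Hypothetical "gradual" term: similarity to the anchor should not
    # increase with temporal distance; penalize any non-monotone jump.
    sims = [cos(segment_embs[0], e) for e in segment_embs[1:]]
    l_gra = sum(max(0.0, later - earlier)
                for earlier, later in zip(sims, sims[1:]))

    return l_align + w * l_gra
```

A record whose segment embeddings drift smoothly away from the anchor incurs no gradual penalty, while a sequence that snaps back toward the anchor is penalized, which is one plausible way a loss could reward capturing slow dynamics across the record.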