This paper explores how to enhance existing masked time-series modeling by randomly dropping sub-sequence-level patches of the time series. On this basis, a simple yet effective method named DropPatch is proposed, which offers two notable advantages: 1) it improves pre-training efficiency quadratically, since fewer patches pass through the Transformer's quadratic-cost self-attention; 2) it yields additional gains in scenarios such as in-domain, cross-domain, few-shot learning, and cold start. Comprehensive experiments verify the effectiveness of the method and analyze its internal mechanism. Empirically, DropPatch strengthens the attention mechanism, reduces information redundancy, and serves as an efficient means of data augmentation. Theoretically, it is shown that randomly dropping patches slows the rate at which Transformer representations collapse toward a rank-1 linear subspace, thereby improving the quality of the learned representations.
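To make the core idea concrete, the following is a minimal sketch, in PyTorch, of random sub-sequence patch dropping applied before the usual masked pre-training step. The function name `drop_patch`, the tensor layout, and the drop ratio are illustrative assumptions, not the paper's implementation.

```python
import torch

def drop_patch(patches: torch.Tensor, drop_ratio: float) -> torch.Tensor:
    """Randomly drop a fraction of sub-sequence patches before masked pre-training.

    patches: (batch, num_patches, patch_dim) tensor of patch embeddings.
    Returns the retained patches, shape (batch, num_kept, patch_dim).
    """
    B, N, D = patches.shape
    num_kept = max(1, int(N * (1.0 - drop_ratio)))
    # Draw an independent random permutation per sample and keep the
    # first num_kept positions of each permutation.
    noise = torch.rand(B, N, device=patches.device)
    keep_idx = noise.argsort(dim=1)[:, :num_kept]        # indices of kept patches
    keep_idx = keep_idx.unsqueeze(-1).expand(-1, -1, D)  # (B, num_kept, D)
    return torch.gather(patches, dim=1, index=keep_idx)

# Usage: drop 60% of the patches, then apply the usual random masking
# to the survivors before feeding them to the Transformer encoder.
x = torch.randn(32, 64, 128)           # (batch, num_patches, embed dim)
kept = drop_patch(x, drop_ratio=0.6)   # -> (32, 25, 128)
```

Because self-attention cost grows quadratically with the number of patches, shrinking the retained set from N to (1 - r)N patches reduces attention compute by roughly a factor of (1 - r)^2, which is the source of the quadratic efficiency gain claimed above.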