Existing approaches for learning time-series representations keep the temporal arrangement of the time steps intact, on the presumption that the original order is optimal for learning. However, non-adjacent sections of real-world time-series may have strong dependencies. Accordingly, we raise the question: is there an alternative arrangement of a time-series that could enable more effective representation learning? To address this, we propose a simple plug-and-play neural network layer called Segment, Shuffle, and Stitch (S3), designed to improve representation learning in time-series models. S3 works by creating non-overlapping segments from the original sequence and shuffling them in a learned manner that is optimal for the task at hand. It then re-attaches the shuffled segments and performs a learned weighted sum with the original input, capturing both the newly shuffled sequence and the original one. S3 is modular and can be stacked to achieve different levels of granularity, and it can be added to many neural architectures, including CNNs and Transformers, with negligible computational overhead. Through extensive experiments on several datasets and state-of-the-art baselines, we show that incorporating S3 yields significant improvements in time-series classification, forecasting, and anomaly detection, improving performance on certain datasets by up to 68\%. We also show that S3 makes learning more stable, with a smoother training loss curve and loss landscape than the original baseline. The code is available at https://github.com/shivam-grover/S3-TimeSeries.
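To make the segment-shuffle-stitch idea concrete, the following is a minimal PyTorch-style sketch of such a layer. It is an illustration only, not the authors' implementation: in particular, the learned-shuffle mechanism shown here (a hard permutation obtained by sorting learnable per-segment scores) and the softmax-weighted mixing are assumptions; the actual design is in the paper and the repository linked above.

```python
import torch
import torch.nn as nn


class S3Sketch(nn.Module):
    """Illustrative sketch of a segment-shuffle-stitch style layer.

    Assumptions (not from the paper): the shuffle order comes from an
    argsort over learnable per-segment scores, and the shuffled and
    original sequences are mixed with softmax-normalized learnable weights.
    """

    def __init__(self, num_segments: int):
        super().__init__()
        self.num_segments = num_segments
        # Learnable score per segment; sorting these scores defines the shuffle order.
        # Note: a hard argsort blocks gradients to the scores, so in practice a
        # differentiable relaxation or straight-through trick would be needed.
        self.segment_scores = nn.Parameter(torch.randn(num_segments))
        # Learnable weights for combining the shuffled sequence with the original input.
        self.mix_weights = nn.Parameter(torch.ones(2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, channels); time is assumed divisible by num_segments.
        b, t, c = x.shape
        seg_len = t // self.num_segments

        # Segment: split the sequence into non-overlapping segments.
        segments = x.reshape(b, self.num_segments, seg_len, c)

        # Shuffle: reorder the segments according to the learned scores.
        order = torch.argsort(self.segment_scores)
        shuffled = segments[:, order]

        # Stitch: re-attach the shuffled segments into a full-length sequence.
        stitched = shuffled.reshape(b, t, c)

        # Learned weighted sum of the stitched sequence and the original input.
        w = torch.softmax(self.mix_weights, dim=0)
        return w[0] * stitched + w[1] * x
```

Because the layer preserves the input shape, several such modules could in principle be stacked with different `num_segments` values to operate at different granularities, matching the modular use described in the abstract.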