Long-term time series forecasting with transformers is hampered by the quadratic complexity of self-attention and by the rigidity of uniform patching, which can be misaligned with the semantic structure of the data. In this paper, we introduce the \textit{B-Spline Adaptive Tokenizer (BSAT)}, a novel, parameter-free method that adaptively segments a time series by fitting it with B-splines. BSAT algorithmically concentrates tokens in high-curvature regions and represents each variable-length basis function as a fixed-size token composed of its coefficient and its position. We further propose a hybrid positional encoding, L-RoPE, which combines an additive learnable positional encoding with a Rotary Positional Embedding whose base is learnable per layer, allowing each layer to attend to temporal dependencies at a different scale. Experiments on several public benchmarks show that our model achieves competitive accuracy while operating at high compression rates, making it particularly well suited to use cases with strong memory constraints.
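To make the tokenization step concrete, the following is a minimal Python sketch of curvature-driven B-spline tokenization under our own assumptions: the function name \texttt{bsat\_tokenize}, the second-difference curvature estimate, the inverse-CDF knot placement, and the use of Greville abscissae as token positions are illustrative choices, not the paper's reference implementation.

\begin{verbatim}
import numpy as np
from scipy.interpolate import splrep

def bsat_tokenize(series, n_knots=16, k=3):
    """Tokenize a 1-D series with an adaptively knotted B-spline fit."""
    t = np.arange(len(series), dtype=float)
    # Estimate local curvature with a second difference
    # (illustrative choice; any curvature proxy would do).
    curvature = np.abs(np.diff(series, n=2)) + 1e-8
    # Invert the curvature CDF: equal-mass sampling places more
    # interior knots where curvature is high.
    cdf = np.cumsum(curvature) / curvature.sum()
    quantiles = np.linspace(0.0, 1.0, n_knots + 2)[1:-1]
    knots = np.unique(np.interp(quantiles, cdf, t[1:-1]))
    # Least-squares B-spline fit with the chosen interior knots
    # (passing t= makes splrep use them directly).
    knot_vec, coeffs, _ = splrep(t, series, k=k, t=knots)
    # One fixed-size token per basis function: (coefficient, position),
    # with position taken as the basis function's Greville abscissa.
    n = len(knot_vec) - k - 1
    positions = np.array(
        [knot_vec[i + 1:i + k + 1].mean() for i in range(n)])
    return np.stack([coeffs[:n], positions], axis=-1)  # shape (n, 2)

# Example: a 512-step series compresses to ~n_knots + k + 1 tokens.
tokens = bsat_tokenize(np.sin(np.linspace(0, 12, 512)) ** 3)
print(tokens.shape)
\end{verbatim}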
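Similarly, a minimal sketch of the rotary component with a per-layer learnable base is given below; the class name, the log-parameterization of the base, and the application to real-valued token positions are our illustrative assumptions, with the additive learnable positional encoding assumed to be applied upstream of the attention layers.

\begin{verbatim}
import torch
import torch.nn as nn

class LayerwiseRoPE(nn.Module):
    """Rotary positional embedding with a per-layer learnable base.

    Each attention layer owns one of these modules, so training can
    push each layer's base toward short- or long-range dependencies.
    """
    def __init__(self, head_dim: int, init_base: float = 10000.0):
        super().__init__()
        assert head_dim % 2 == 0
        self.head_dim = head_dim
        # Log-parameterize the base so it stays positive while learnable
        # (illustrative choice).
        self.log_base = nn.Parameter(torch.tensor(float(init_base)).log())

    def forward(self, x, positions):
        # x: (..., seq_len, head_dim) queries or keys.
        # positions: (seq_len,) real-valued token positions, e.g. the
        # positions emitted by the B-spline tokenizer (assumption).
        half = self.head_dim // 2
        # Inverse frequencies: base ** (-i / half) for i = 0..half-1.
        inv_freq = torch.exp(
            -torch.arange(half, device=x.device, dtype=x.dtype)
            / half * self.log_base)
        angles = positions[:, None] * inv_freq[None, :]  # (seq, half)
        cos, sin = angles.cos(), angles.sin()
        x1, x2 = x[..., :half], x[..., half:]
        # Standard rotary rotation of each (x1_i, x2_i) pair.
        return torch.cat(
            [x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

# Usage inside a layer: q = LayerwiseRoPE(64)(q, token_positions)
\end{verbatim}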