Transformer-based models have shown strong performance in time-series forecasting by leveraging self-attention to model long-range temporal dependencies. However, their effectiveness depends critically on the quality and structure of input representations derived from raw multivariate time-series data, particularly as sequence length and data scale increase. This paper proposes a two-stage forecasting framework that explicitly separates local temporal representation learning from global dependency modelling. In the proposed approach, a convolutional neural network operates on fixed-length temporal patches to extract short-range temporal dynamics and non-linear feature interactions, producing compact patch-level token embeddings. Token-level self-attention is applied during representation learning to refine these embeddings, after which a Transformer encoder models inter-patch temporal dependencies to generate forecasts. The method is evaluated on a synthetic multivariate time-series dataset with controlled static and dynamic factors, using an extended sequence length and a larger number of samples. Experimental results demonstrate that the proposed framework consistently outperforms a convolutional baseline under increased temporal context and remains competitive with a strong patch-based Transformer model. These findings indicate that structured patch-level tokenization provides a scalable and effective representation for multivariate time-series forecasting, particularly when longer input sequences are considered.
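The two-stage pipeline described above (fixed-length patch tokenization, token-level self-attention for refinement, then inter-patch dependency modelling before a forecasting head) can be sketched as follows. This is a minimal NumPy illustration of the data flow only: all shapes, the random weights, the linear patch encoder standing in for the paper's CNN, and the single attention pass standing in for the full Transformer encoder are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

T, C = 96, 4   # input length and number of variates (assumed)
P = 16         # fixed patch length (assumed)
D = 8          # patch-token embedding dimension (assumed)
H = 24         # forecast horizon (assumed)

x = rng.standard_normal((T, C))            # one multivariate series

# Stage 1: split the series into fixed-length temporal patches and encode
# each patch into a compact token. A linear map stands in for the CNN here.
patches = x.reshape(T // P, P, C)          # (num_patches, P, C)
W_local = rng.standard_normal((P * C, D)) * 0.1
tokens = patches.reshape(T // P, -1) @ W_local   # (num_patches, D)

def self_attention(tok):
    """Single-head scaled dot-product self-attention over patch tokens."""
    scores = tok @ tok.T / np.sqrt(tok.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ tok

# Token-level self-attention refines the patch embeddings during
# representation learning.
refined = self_attention(tokens)           # (num_patches, D)

# Stage 2: a Transformer encoder models inter-patch temporal dependencies;
# one further attention pass stands in for it in this sketch.
context = self_attention(refined)          # (num_patches, D)

# Linear forecasting head over the flattened patch representations (assumed).
W_head = rng.standard_normal((context.size, H * C)) * 0.1
forecast = (context.reshape(-1) @ W_head).reshape(H, C)
print(forecast.shape)   # (24, 4)
```

The explicit separation is visible in the code: `W_local` and the first `self_attention` call only ever see information local to (or exchanged between) patch tokens, while the second stage operates purely on the compact token sequence, which is what keeps attention cost manageable as the raw sequence length `T` grows.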