Rough Transformers: Lightweight Continuous-Time Sequence Modelling with Path Signatures

Time-series data in real-world settings typically exhibit long-range dependencies and are observed at non-uniform intervals. In these settings, traditional sequence-based recurrent models struggle. To overcome this, researchers often replace recurrent architectures with Neural ODE-based models to account for irregularly sampled data and use Transformer-based architectures to account for long-range dependencies. Despite the success of these two approaches, both incur very high computational costs for input sequences of even moderate length. To address this challenge, we introduce the Rough Transformer, a variation of the Transformer model that operates on continuous-time representations of input sequences and incurs significantly lower computational costs. In particular, we propose multi-view signature attention, which uses path signatures to augment vanilla attention and to capture both local and global (multi-scale) dependencies in the input data, while remaining robust to changes in the sequence length and sampling frequency and yielding improved spatial processing. We find that, on a variety of time-series-related tasks, Rough Transformers consistently outperform their vanilla attention counterparts while obtaining the representational benefits of Neural ODE-based models, all at a fraction of the computational time and memory resources.
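To make the idea of a multi-view signature feature more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the irregular samples are linearly interpolated into a time-augmented path, truncates the signature at depth 2, and pairs a "global" view (from the start of the series to each window edge) with a "local" view (over each window alone). The function names `signature_depth2` and `multi_view_features` and the parameter `num_windows` are hypothetical, and the attention layers that would consume these features are omitted.

```python
import numpy as np


def signature_depth2(path):
    """Depth-2 truncated signature of a piecewise-linear path of shape (n_points, dim)."""
    path = np.asarray(path, dtype=float)
    dim = path.shape[1]
    level1 = np.zeros(dim)          # first-level terms: total increment
    level2 = np.zeros((dim, dim))   # second-level terms: iterated integrals
    for delta in np.diff(path, axis=0):
        # Chen's identity: append one linear segment with increment `delta`.
        level2 += np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        level1 += delta
    return np.concatenate([level1, level2.ravel()])


def multi_view_features(times, values, num_windows):
    """Global and local depth-2 signatures on a fixed grid of windows.

    Assumption (illustrative only): the observations are linearly
    interpolated, so the output has shape (num_windows, feature_dim)
    regardless of how many samples the input contains.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    if values.ndim == 1:
        values = values[:, None]
    edges = np.linspace(times[0], times[-1], num_windows + 1)
    # Add the window edges to the sample times so each window starts and
    # ends exactly on a point of the interpolated path.
    grid = np.union1d(times, edges)
    channels = [np.interp(grid, times, values[:, c]) for c in range(values.shape[1])]
    path = np.column_stack([grid] + channels)  # time-augmented path
    features = []
    for left, right in zip(edges[:-1], edges[1:]):
        lo = np.searchsorted(grid, left)
        hi = np.searchsorted(grid, right)
        global_view = signature_depth2(path[: hi + 1])    # start of series to window end
        local_view = signature_depth2(path[lo: hi + 1])   # this window only
        features.append(np.concatenate([global_view, local_view]))
    return np.stack(features)
```

Under these assumptions, the feature sequence has length `num_windows` rather than the raw sequence length, which is one way to read the abstract's claims of lower cost and robustness to sampling frequency: two different samplings of the same piecewise-linear path give the same features.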

