Rough Transformers for Continuous and Efficient Time-Series Modelling

Time-series data in real-world medical settings typically exhibit long-range dependencies and are observed at non-uniform intervals. In such contexts, traditional sequence-based recurrent models struggle. To overcome this, researchers replace recurrent architectures with Neural ODE-based models to handle irregularly sampled data, and use Transformer-based architectures to account for long-range dependencies. Despite the success of these two approaches, both incur very high computational costs for input sequences of even moderate length. To mitigate this, we introduce the Rough Transformer, a variation of the Transformer model which operates on continuous-time representations of input sequences and incurs significantly reduced computational costs, which is critical for addressing the long-range dependencies common in medical contexts. In particular, we propose multi-view signature attention, which uses path signatures to augment vanilla attention and to capture both local and global dependencies in input data, while remaining robust to changes in the sequence length and sampling frequency. We find that Rough Transformers consistently outperform their vanilla attention counterparts while obtaining the benefits of Neural ODE-based models using a fraction of the computational time and memory resources on synthetic and real-world time-series tasks.
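The abstract does not spell out how path signatures are computed or combined into "views", so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the input is treated as a piecewise-linear path through the observed points, truncates the signature at depth 2, and takes a "global" view from the start of the path to each window end and a "local" view over the latest window only. The function names `signature_depth2` and `multi_view_signature` and the `window_ends` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def signature_depth2(path: Sequence[Point]) -> Tuple[List[float], List[List[float]]]:
    """Depth-2 signature of the piecewise-linear path through `path`.

    Level 1 is the total increment. Level 2 holds the iterated integrals
    S[i][j] = integral of (X^i_t - X^i_0) dX^j_t along the path.
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        # Append one linear segment (Chen's identity at depth 2).
        # level1 still holds the increment accumulated BEFORE this segment.
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


def flatten(sig: Tuple[List[float], List[List[float]]]) -> List[float]:
    """Concatenate level 1 and row-major level 2 into one feature vector."""
    level1, level2 = sig
    return list(level1) + [v for row in level2 for v in row]


def multi_view_signature(path: Sequence[Point], window_ends: Sequence[int]) -> List[List[float]]:
    """Assumed multi-view features: one vector per window end.

    `window_ends` are increasing positive indices into `path`. Each vector
    concatenates the global view (path start to window end) with the local
    view (previous window end to this window end).
    """
    features = []
    start = 0
    for end in window_ends:
        global_view = flatten(signature_depth2(path[: end + 1]))
        local_view = flatten(signature_depth2(path[start : end + 1]))
        features.append(global_view + local_view)
        start = end
    return features


# The same 2-D path observed at two sampling rates (extra midpoints in `dense`).
coarse = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
dense = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0), (1.5, 0.5), (2.0, 0.0)]

coarse_features = multi_view_signature(coarse, window_ends=[1, 2])
dense_features = multi_view_signature(dense, window_ends=[2, 4])
print(len(coarse_features), len(coarse_features[0]))
```

In this sketch the number of feature vectors depends on the number of windows rather than on the number of raw observations, so a downstream attention layer would see a sequence of fixed length regardless of sampling rate. The `coarse` and `dense` inputs trace the same path, and their features agree, which illustrates the kind of robustness to sampling frequency that the abstract attributes to the signature representation.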
