Alternators For Sequence Modeling

from arxiv, A new versatile family of sequence models that can be used for both generative modeling and supervised learning. The codebase will be made available upon publication. This paper is dedicated to Thomas Sankara

This paper introduces alternators, a novel family of non-Markovian dynamical models for sequences. An alternator features two neural networks: the observation trajectory network (OTN) and the feature trajectory network (FTN). The OTN and the FTN work in conjunction, alternating between outputting samples in the observation space and some feature space, respectively, over a cycle. The parameters of the OTN and the FTN are not time-dependent and are learned via a minimum cross-entropy criterion over the trajectories. Alternators are versatile. They can be used as dynamical latent-variable generative models or as sequence-to-sequence predictors. Alternators can uncover the latent dynamics underlying complex sequential data, accurately forecast and impute missing data, and sample new trajectories. We showcase the capabilities of alternators in three applications. We first used alternators to model the Lorenz equations, often used to describe chaotic behavior. We then applied alternators to Neuroscience, to map brain activity to physical activity. Finally, we applied alternators to Climate Science, focusing on sea-surface temperature forecasting. In all our experiments, we found alternators are stable to train, fast to sample from, yield high-quality generated samples and latent variables, and often outperform strong baselines such as Mambas, neural ODEs, and diffusion models in the domains we studied.

翻译：本文提出交替器——一种用于序列建模的新型非马尔可夫动态模型族。交替器包含两个神经网络：观测轨迹网络（OTN）与特征轨迹网络（FTN）。OTN与FTN协同工作，在一个周期内交替地分别在观测空间与特征空间中输出样本。OTN与FTN的参数不依赖于时间，通过轨迹的最小交叉熵准则进行学习。交替器具有多功能性：既可作为动态隐变量生成模型，也可作为序列到序列预测器。该模型能够揭示复杂序列数据背后的潜在动力学机制，精确预测与填补缺失数据，并生成新轨迹。我们在三个应用中展示了交替器的能力：首先将其用于描述混沌行为的洛伦兹方程建模；随后应用于神经科学领域，实现大脑活动到躯体活动的映射；最后应用于气候科学中的海表温度预测任务。在所有实验中，我们发现交替器具有训练稳定、采样快速、生成样本与隐变量质量高等特点，并在所研究领域中常优于Mamba、神经ODE及扩散模型等强基线方法。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日