Long-term time series forecasting (LTSF) aims to predict future values of a time series given its past values. The current state-of-the-art (SOTA) on this problem is attained in some cases by linear-centric models, which primarily feature a linear mapping layer. However, due to their inherent simplicity, they are unable to adapt their prediction rules to periodic changes in time series patterns. To address this challenge, we propose Mixture-of-Linear-Experts (MoLE), a Mixture-of-Experts-style augmentation for linear-centric models. Instead of training a single model, MoLE trains multiple linear-centric models (i.e., experts) and a router model that weighs and mixes their outputs. While the entire framework is trained end-to-end, each expert learns to specialize in a specific temporal pattern, and the router model learns to compose the experts adaptively. Experiments show that MoLE reduces the forecasting error of linear-centric models, including DLinear, RLinear, and RMLP, in over 78% of the datasets and settings we evaluated. By using MoLE, existing linear-centric models achieve SOTA LTSF results in 68% of the experiments that PatchTST reports and we compare to, whereas existing single-head linear-centric models achieve SOTA results in only 25% of cases. Additionally, MoLE models achieve SOTA in all settings for the newly released Weather2K datasets.
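The mixture described above can be sketched as follows: each expert is a linear map from the look-back window to the forecast horizon, and a router produces softmax mixing weights over the experts from the same input. This is a minimal NumPy sketch with random (untrained) weights; the shapes, the linear router, and all variable names are assumptions for illustration, not the paper's exact implementation (in practice all parameters would be learned end-to-end).

```python
import numpy as np

rng = np.random.default_rng(0)

L, T, K = 96, 24, 3  # look-back length, forecast horizon, number of experts

# Each expert is a linear map from the look-back window to the horizon
# (weights here are random placeholders; in MoLE they are trained end-to-end).
expert_W = rng.normal(size=(K, T, L)) * 0.01
expert_b = np.zeros((K, T))

# Router: here assumed to be a linear layer over the input followed by a
# softmax, producing one mixing weight per expert.
router_W = rng.normal(size=(K, L)) * 0.01
router_b = np.zeros(K)

def softmax(z):
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def mole_forward(x):
    """Forecast the next T steps from a look-back window x of length L."""
    preds = expert_W @ x + expert_b          # (K, T): one forecast per expert
    gate = softmax(router_W @ x + router_b)  # (K,): mixing weights, sum to 1
    return gate @ preds                      # (T,): weighted mixture of experts

x = rng.normal(size=L)       # a single input window
y_hat = mole_forward(x)      # mixed forecast of length T
```

Because the gate depends on the input, different windows can be routed to different specialists, which is what lets the mixture adapt to periodic pattern changes that a single linear head cannot capture.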