DAM: Towards A Foundation Model for Time Series Forecasting

It is challenging to scale time series forecasting models such that they forecast accurately for multiple distinct domains and datasets, all with potentially different underlying collection procedures (e.g., sample resolution), patterns (e.g., periodicity), and prediction requirements (e.g., reconstruction vs. forecasting). We call this general task universal forecasting. Existing methods usually assume that input data is regularly sampled, and they forecast to pre-determined horizons, so they fail to generalise outside the scope of their training. We propose the DAM, a neural model that takes randomly sampled histories and outputs an adjustable basis composition as a continuous function of time for forecasting to non-fixed horizons. It involves three key components: (1) a flexible approach for using randomly sampled histories from a long-tail distribution, which enables an efficient global perspective of the underlying temporal dynamics while retaining focus on the recent history; (2) a transformer backbone that is trained on these actively sampled histories to produce, as representational output, (3) the basis coefficients of a continuous function of time. We show that a single univariate DAM, trained on 25 time series datasets, either outperformed or closely matched existing SoTA models at multivariate long-term forecasting across 18 datasets, including 8 held out for zero-shot transfer, even though these models were trained to specialise for each dataset-horizon combination. This single DAM excels at zero-shot transfer and very-long-term forecasting, performs well at imputation, is interpretable via basis function composition and attention, can be tuned for different inference-cost requirements, and is robust to missing and irregularly sampled data by design.
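Components (1) and (3) can be illustrated with a minimal sketch; it is not the authors' implementation. Everything named here is an assumption made for illustration: the Cauchy-like weighting used as the "long-tail" distribution, the sine/cosine basis with fixed periods, and the helper names `sample_history_indices` and `basis_composition`. The transformer backbone of component (2) is replaced by an ordinary least-squares fit, purely to show how a set of basis coefficients yields a forecast at arbitrary, non-fixed horizons from irregularly sampled history.

```python
import numpy as np


def sample_history_indices(n_points, n_samples, scale, rng):
    """Draw past-step indices (0 = most recent) from a long-tailed distribution.

    Assumption: weights proportional to 1 / (1 + (k / scale)^2). The abstract
    only says "long-tail"; this particular shape is illustrative.
    """
    k = np.arange(n_points)
    weights = 1.0 / (1.0 + (k / scale) ** 2)
    probs = weights / weights.sum()
    chosen = rng.choice(n_points, size=n_samples, replace=False, p=probs)
    return np.sort(chosen)


def basis_composition(t, coeffs, periods):
    """Evaluate f(t) = sum_i a_i * sin(2*pi*t/p_i) + b_i * cos(2*pi*t/p_i).

    coeffs has shape (n_basis, 2): column 0 holds a_i, column 1 holds b_i.
    Assumption: a sinusoidal basis with hand-picked periods.
    """
    t = np.asarray(t, dtype=float)[:, None]
    phase = 2.0 * np.pi * t / periods[None, :]
    return np.sin(phase) @ coeffs[:, 0] + np.cos(phase) @ coeffs[:, 1]


def truth(t):
    """Synthetic signal with a 24-step and a 168-step cycle."""
    return 2.0 * np.sin(2 * np.pi * t / 24) + 0.5 * np.cos(2 * np.pi * t / 168)


rng = np.random.default_rng(0)
n_points = 2000

# Irregular history: dense near the present (t = 0), sparse far in the past.
idx = sample_history_indices(n_points, n_samples=256, scale=50.0, rng=rng)
t_hist = -idx.astype(float)
y_hist = truth(t_hist)

# Stand-in for the transformer: least squares recovers the basis coefficients.
periods = np.array([24.0, 168.0])
phase = 2.0 * np.pi * t_hist[:, None] / periods[None, :]
design = np.hstack([np.sin(phase), np.cos(phase)])
solution, *_ = np.linalg.lstsq(design, y_hist, rcond=None)
coeffs = np.stack([solution[:2], solution[2:]], axis=1)

# The result is a continuous function of time, so any horizon can be queried.
t_future = np.array([0.5, 10.0, 333.3, 5000.0])
forecast = basis_composition(t_future, coeffs, periods)
```

Because the output is a function of continuous time rather than a fixed-length vector, the same coefficients serve for short horizons, very long horizons, and imputation at past timestamps that were never sampled.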

