We propose the Taylorformer for time series and other random processes. Its two key components are: 1) the LocalTaylor wrapper, which learns how and when to use Taylor series-based approximations for predictions, and 2) the MHA-X attention block, which makes predictions in a way inspired by how a Gaussian Process's mean prediction is a linear smoothing of contextual data. Taylorformer outperforms the state of the art on several forecasting datasets, including electricity, oil temperatures and exchange rates, with at least a 14% improvement in MSE on all tasks, and achieves better likelihood on 5 of 6 classic Neural Process tasks such as meta-learning 1D functions. Taylorformer combines desirable features from the Neural Process literature (uncertainty-aware predictions and consistency) and the forecasting literature (predictive accuracy), two previously distinct bodies of work.
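The statement that a Gaussian Process's mean prediction is a linear smoothing of contextual data can be illustrated with a minimal NumPy sketch. This is not the MHA-X block itself; it only shows the standard GP identity that motivates it, namely that the posterior mean at target inputs equals a weight matrix applied to the context outputs. The function names `rbf_kernel` and `gp_smoothing_weights`, the squared-exponential kernel, and the noise and length-scale values are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between 1D input arrays a and b (illustrative choice)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_smoothing_weights(x_context, x_target, noise=1e-2, length_scale=1.0):
    """Return W such that the GP posterior mean at x_target is W @ y_context.

    W = K_tc (K_cc + noise * I)^{-1}, computed with a linear solve
    rather than an explicit matrix inverse.
    """
    k_cc = rbf_kernel(x_context, x_context, length_scale)
    k_tc = rbf_kernel(x_target, x_context, length_scale)
    gram = k_cc + noise * np.eye(len(x_context))
    # gram is symmetric, so solve(gram, k_tc.T).T equals k_tc @ inv(gram).
    return np.linalg.solve(gram, k_tc.T).T


# Hypothetical context set: noiseless samples of sin(x) on [0, 5].
x_context = np.linspace(0.0, 5.0, 20)
y_context = np.sin(x_context)
x_target = np.array([1.3, 2.7, 4.1])

# The weights depend only on the inputs, not on the observed values.
weights = gp_smoothing_weights(x_context, x_target)

# The mean prediction is a weighted sum (linear smoothing) of the context values.
mean = weights @ y_context
```

Because `weights` depends only on the input locations, the prediction is linear in `y_context`: scaling or summing context outputs scales or sums the predicted means accordingly. An attention block has a similar shape, with learned weights taking the place of the kernel-derived ones.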