Spatiotemporal graph neural networks have proven effective in time series forecasting, outperforming standard univariate predictors in several settings. These architectures exploit a graph structure and relational inductive biases to learn a single (global) inductive model that can predict any number of input time series, each associated with a graph node. Although a single global model is more computationally and data efficient than fitting a set of local models, it can be a limitation whenever some of the time series are generated by a different spatiotemporal stochastic process. The main objective of this paper is to understand the interplay between globality and locality in graph-based spatiotemporal forecasting, while also proposing a methodological framework that rationalizes the practice of including trainable node embeddings in such architectures. We ascribe to trainable node embeddings the role of amortizing the learning of specialized components. Moreover, embeddings allow for 1) effectively combining the advantages of shared message-passing layers with node-specific parameters and 2) efficiently transferring the learned model to new node sets. Supported by strong empirical evidence, we provide insights and guidelines for specializing graph-based models to the dynamics of each time series and show that this specialization plays a crucial role in obtaining accurate predictions.
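To make the idea of combining shared message-passing layers with node-specific parameters concrete, the following is a minimal NumPy sketch, not the architecture evaluated in the paper. All names (`forecast_step`, `w_self`, `w_neigh`, `w_out`), the layer form, and the dimensions are illustrative assumptions. It only shows how a per-node embedding table can be concatenated to the input window so that the remaining weights stay shared across nodes.

```python
import numpy as np


def forecast_step(x, adj, emb, w_self, w_neigh, w_out):
    """One shared message-passing layer followed by a linear readout.

    x:   (N, W) window of past observations, one row per node
    adj: (N, N) row-normalized adjacency matrix
    emb: (N, D) node embeddings, the only node-specific parameters
    The weight matrices are shared by all nodes (the "global" part).
    """
    h = np.concatenate([x, emb], axis=-1)    # (N, W + D): input plus embedding
    msg = adj @ h                            # average over each node's neighbors
    z = np.tanh(h @ w_self + msg @ w_neigh)  # shared transformation
    return z @ w_out                         # (N, H) forecasts


rng = np.random.default_rng(0)
N, W, D, hidden, H = 4, 12, 3, 8, 2  # hypothetical sizes

# Path graph over 4 nodes, row-normalized.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
adj = adj / adj.sum(axis=1, keepdims=True)

x = rng.normal(size=(N, W))
emb = rng.normal(size=(N, D))  # would be trained jointly with the shared weights
w_self = rng.normal(size=(W + D, hidden))
w_neigh = rng.normal(size=(W + D, hidden))
w_out = rng.normal(size=(hidden, H))

y_hat = forecast_step(x, adj, emb, w_self, w_neigh, w_out)

# Transfer to a new node set (assumption: a 6-node ring graph).
# The shared weights are reused unchanged; only a new embedding table is needed.
N_new = 6
eye = np.eye(N_new)
adj_new = (np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1)) / 2.0
x_new = rng.normal(size=(N_new, W))
emb_new = rng.normal(size=(N_new, D))  # placeholder values, not fitted here
y_new = forecast_step(x_new, adj_new, emb_new, w_self, w_neigh, w_out)
```

In this sketch the number of node-specific parameters grows as N × D, while everything else is independent of the number of nodes. In a transfer setting one would presumably fit `emb_new` on data from the new nodes while keeping the shared weights fixed. Here it is randomly initialized only to show that the shapes work out.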