As an important application of spatio-temporal (ST) data, ST traffic forecasting plays a crucial role in improving urban travel efficiency and promoting sustainable development. In practice, the dynamics of traffic data frequently undergo distributional shifts caused by external factors such as temporal evolution and spatial differences. Forecasting models must therefore handle the out-of-distribution (OOD) setting, in which test data is distributed differently from training data. In this work, we first formalize the problem by constructing a causal graph over past traffic data, future traffic data, and external ST contexts. We show that prior methods fail on OOD traffic data because ST contexts act as a confounder, i.e., a common cause of both past and future data. We then propose a theoretical solution, Disentangled Contextual Adjustment (DCA), from a causal perspective. DCA distinguishes invariant causal correlations from variant spurious ones and deconfounds the effect of ST contexts. Building on DCA, we devise a Spatio-Temporal sElf-superVised dEconfounding (STEVE) framework. STEVE first encodes traffic data into two disentangled representations, one associated with invariant ST contexts and the other with variant ST contexts. It then uses representative ST contexts from three conceptually different perspectives (temporal, spatial, and semantic) as self-supervised signals to inject context information into both representations. In this way, we improve the generalization ability of the learned context-oriented representations for OOD ST traffic forecasting. Comprehensive experiments on four large-scale benchmark datasets demonstrate that STEVE consistently outperforms state-of-the-art baselines across various ST OOD scenarios.
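To make the confounding argument concrete, the following is a minimal toy sketch of a generic backdoor adjustment over a discrete context variable. It is not the DCA method or the STEVE framework. The variable names (`p_c`, `p_x_given_c`, `p_y_given_xc`) and all probabilities are hypothetical, and the example assumes an extreme case in which a binary context C drives both past traffic X and future traffic Y while X has no direct effect on Y.

```python
# Toy illustration (assumed numbers, not from the paper): a binary ST context C
# is a common cause of past traffic X and future traffic Y.

def observational(p_c, p_x_given_c, p_y_given_xc, x):
    """P(Y=1 | X=x): contexts are weighted by P(c | x), so confounding leaks in."""
    # Joint P(C=c, X=x) for each context value c.
    joint = [
        p_c[c] * (p_x_given_c[c] if x == 1 else 1.0 - p_x_given_c[c])
        for c in range(len(p_c))
    ]
    p_x = sum(joint)
    return sum(p_y_given_xc[x][c] * joint[c] / p_x for c in range(len(p_c)))


def backdoor_adjusted(p_c, p_y_given_xc, x):
    """P(Y=1 | do(X=x)) = sum_c P(Y=1 | x, c) * P(c): contexts weighted by P(c)."""
    return sum(p_y_given_xc[x][c] * p_c[c] for c in range(len(p_c)))


# Hypothetical distributions.
p_c = [0.5, 0.5]              # P(C=c)
p_x_given_c = [0.2, 0.8]      # P(X=1 | C=c)
# p_y_given_xc[x][c] = P(Y=1 | X=x, C=c); identical rows mean Y ignores X.
p_y_given_xc = [[0.1, 0.9], [0.1, 0.9]]

obs_low = observational(p_c, p_x_given_c, p_y_given_xc, x=0)
obs_high = observational(p_c, p_x_given_c, p_y_given_xc, x=1)
adj_low = backdoor_adjusted(p_c, p_y_given_xc, x=0)
adj_high = backdoor_adjusted(p_c, p_y_given_xc, x=1)

print("observational:", round(obs_low, 2), round(obs_high, 2))
print("adjusted:     ", round(adj_low, 2), round(adj_high, 2))
```

In this toy setting, conditioning on X alone gives about 0.26 versus 0.74 for P(Y=1), even though X has no causal effect on Y, while the adjusted quantity is 0.5 in both cases. A model fitted to the observational quantity would be relying on a correlation that exists only through the context, which is the kind of spurious dependence the abstract attributes to ST contexts.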