Traffic Forecasting on New Roads Unseen in the Training Data Using Spatial Contrastive Pre-Training

New roads are constructed all the time, yet the ability of existing deep forecasting models to generalize to roads not seen in the training data (unseen roads) has rarely been explored. In this paper, we introduce a novel setup called the spatio-temporal (ST) split to evaluate this ability: models are trained on data from a sample of roads but tested on roads not seen during training. We also present a novel framework called Spatial Contrastive Pre-Training (SCPT), which introduces a spatial encoder module that extracts latent features from unseen roads at inference time. The spatial encoder is pre-trained using contrastive learning; during inference, it requires only two days of traffic data on the new roads and no re-training. We further show that the output of the spatial encoder can be used effectively to infer latent node embeddings on unseen roads at inference time. The SCPT framework also incorporates a new layer, the spatially gated addition (SGA) layer, to effectively combine the latent features output by the spatial encoder with existing backbones. Additionally, since data on unseen roads is limited, we argue that it is better to decouple traffic signals into trivial-to-capture periodic signals and difficult-to-capture Markovian signals, and to have the spatial encoder learn only the Markovian signals. Finally, we empirically evaluate SCPT using the ST split setup on four real-world datasets. The results show that adding SCPT to a backbone consistently improves forecasting performance on unseen roads. More importantly, the improvements are greater when forecasting further into the future. The code is available on GitHub at https://github.com/cruiseresearchgroup/forecasting-on-new-roads
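
To make the ST split concrete, the following is a minimal sketch of one way such a split could be built, assuming the traffic data is a NumPy array of shape (time steps, roads). The function name `st_split`, the split fractions, and the uniform random sampling of roads are illustrative assumptions and are not taken from the paper or its repository.

```python
import numpy as np


def st_split(data, train_road_frac=0.5, train_time_frac=0.7, seed=0):
    """Split a (time steps, roads) array along both time and space.

    Hypothetical helper: the fractions and the random road sampling are
    assumptions for illustration, not the authors' exact protocol.
    """
    num_steps, num_roads = data.shape
    rng = np.random.default_rng(seed)

    # Spatial split: sample the roads that the model may see in training.
    perm = rng.permutation(num_roads)
    num_seen = int(round(train_road_frac * num_roads))
    seen_roads = np.sort(perm[:num_seen])
    unseen_roads = np.sort(perm[num_seen:])

    # Temporal split: earlier time steps for training, later ones for testing.
    cut = int(round(train_time_frac * num_steps))

    train = data[:cut][:, seen_roads]    # seen roads, earlier period
    test = data[cut:][:, unseen_roads]   # unseen roads, later period
    return train, test, seen_roads, unseen_roads


if __name__ == "__main__":
    # Toy data: 100 time steps on 10 roads.
    toy = np.arange(100 * 10, dtype=float).reshape(100, 10)
    train, test, seen, unseen = st_split(toy)
    print(train.shape, test.shape)
    print("seen roads:", seen, "unseen roads:", unseen)
```

In this sketch the test set contains only roads that never appear in the training set, so any error measured on it reflects generalization to unseen roads rather than to unseen time periods alone.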
