Towards Unifying Diffusion Models for Probabilistic Spatio-Temporal Graph Learning

Spatio-temporal graph learning is a fundamental problem in the Web of Things era and underpins a plethora of Web applications, such as smart cities, human mobility analysis, and climate analysis. Existing approaches tackle different learning tasks independently, tailoring their models to unique task characteristics. These methods, however, fall short of modeling the intrinsic uncertainties in spatio-temporal data. Moreover, their specialized designs limit their universality as general spatio-temporal learning solutions. In this paper, we propose to model the learning tasks from a unified perspective, viewing them as predictions based on conditional information with shared spatio-temporal patterns. Based on this proposal, we introduce Unified Spatio-Temporal Diffusion Models (USTD) to address the tasks uniformly within the uncertainty-aware diffusion framework. USTD is holistically designed, comprising a shared spatio-temporal encoder and task-specific, attention-based denoising networks. The shared encoder, optimized by a pre-training strategy, effectively captures conditional spatio-temporal patterns. The denoising networks, utilizing both cross- and self-attention, integrate conditional dependencies and generate predictions. Opting for forecasting and kriging as downstream tasks, we design Spatial Gated Attention (SGA) and Temporal Gated Attention (TGA) for each task, with different emphases on the spatial and temporal dimensions, respectively. By combining the advantages of deterministic encoders and probabilistic diffusion models, USTD achieves state-of-the-art performance compared to deterministic and probabilistic baselines in both tasks, while also providing valuable uncertainty estimates.
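To make the idea of a conditional, uncertainty-aware diffusion framework concrete, the following is a minimal sketch of the standard DDPM-style noise-prediction objective together with a sample-based uncertainty summary. It is not the USTD implementation: the linear noise schedule, the function names (`make_alpha_bar`, `q_sample`, `denoising_loss`, `summarize_samples`), the `denoiser(xt, cond, t)` signature, and the 5%/95% quantile levels are all assumptions made for illustration, and the denoiser is left as an arbitrary callable standing in for a task-specific attention network.

```python
import numpy as np


def make_alpha_bar(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention terms for an assumed linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def q_sample(x0, t, alpha_bar, rng):
    """Forward diffusion: corrupt the clean target x0 to step t in closed form."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise


def denoising_loss(denoiser, x0, cond, t, alpha_bar, rng):
    """Noise-prediction MSE; `cond` is the conditional representation
    (hypothetically, the output of a shared spatio-temporal encoder)."""
    xt, noise = q_sample(x0, t, alpha_bar, rng)
    pred = denoiser(xt, cond, t)
    return float(np.mean((pred - noise) ** 2))


def summarize_samples(samples, lower=0.05, upper=0.95):
    """Point estimate and quantile band from samples stacked on axis 0."""
    mean = samples.mean(axis=0)
    lo = np.quantile(samples, lower, axis=0)
    hi = np.quantile(samples, upper, axis=0)
    return mean, lo, hi
```

In this sketch, forecasting and kriging would differ only in what `x0` and `cond` contain (future values versus values at unobserved nodes), which is one way to read the unified view described above. Drawing several predictions from a trained sampler and passing them to `summarize_samples` yields a point estimate plus an interval that serves as the uncertainty estimate.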
