Spatio-temporal (ST) prediction is an important and widely used technique in data mining and analytics, especially for ST data in urban systems such as transportation data. In practice, the generation of ST data is usually influenced by various latent factors tied to natural phenomena or human socioeconomic activities, and each factor affects only specific spatial areas. However, existing ST prediction methods usually do not separate the impacts of different factors; instead, they directly model the entangled impact of all factors together. This increases the modeling complexity of ST data and compromises model interpretability. To this end, we propose a multi-factor ST prediction task that predicts the partial evolution of ST data under each factor and combines these partial predictions into a final prediction. We make two contributions to this task: an effective theoretical solution and a portable instantiation framework. Specifically, we first propose a theoretical solution called the decomposed prediction strategy and prove its effectiveness from the perspective of information entropy. Building on this strategy, we instantiate a novel model-agnostic framework, named spatio-temporal graph decomposition learning (STGDL), for multi-factor ST prediction. The framework consists of two main components: an automatic graph decomposition module that decomposes the original graph structure inherent in ST data into subgraphs corresponding to different factors, and a decomposed learning network that learns the partial ST data on each subgraph separately and integrates the results for the final prediction. We conduct extensive experiments on four real-world ST datasets covering two types of graphs, i.e., grid graphs and network graphs. Results show that our framework significantly reduces the prediction errors of various ST models, by 9.41% on average and by up to 35.36%. Furthermore, a case study reveals the interpretability potential of our framework.
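To make the decompose-then-integrate idea above concrete, the following is a minimal NumPy sketch and not the STGDL implementation. It rests on several assumptions that do not come from the abstract: edges are softly assigned to K hypothetical factors by a softmax over per-edge logits (`edge_logits`), the backbone ST model is replaced by a single row-normalised graph propagation step (`propagate`), and the partial predictions are fused by a weighted sum (`fuse_weights`). All function and parameter names are illustrative.

```python
import numpy as np


def decompose_graph(adj, edge_logits):
    """Split adj (N, N) into K subgraphs whose adjacencies sum back to adj.

    ASSUMPTION: edge_logits has shape (K, N, N); a softmax over the K axis
    gives every edge a soft assignment to K hypothetical factors.
    """
    shifted = edge_logits - edge_logits.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)
    return weights * adj  # shape (K, N, N)


def propagate(sub_adj, x):
    """Stand-in for a backbone ST model: one row-normalised propagation step."""
    deg = sub_adj.sum(axis=1, keepdims=True)
    # Rows with zero degree stay zero instead of dividing by zero.
    norm = np.divide(sub_adj, deg, out=np.zeros_like(sub_adj), where=deg > 0)
    return norm @ x


def decomposed_predict(adj, edge_logits, x, fuse_weights):
    """Predict on each subgraph separately, then fuse the partial predictions.

    ASSUMPTION: fusion is a weighted sum with one scalar weight per factor.
    """
    subgraphs = decompose_graph(adj, edge_logits)
    partial = np.stack([propagate(a, x) for a in subgraphs])  # (K, N, F)
    return np.tensordot(fuse_weights, partial, axes=1)  # (N, F)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_nodes, n_feats, n_factors = 5, 3, 2
    adj = (rng.random((n_nodes, n_nodes)) > 0.5).astype(float)
    logits = rng.normal(size=(n_factors, n_nodes, n_nodes))
    x = rng.normal(size=(n_nodes, n_feats))
    y = decomposed_predict(adj, logits, x, np.full(n_factors, 1.0 / n_factors))
    print(y.shape)
```

In a trainable version, `edge_logits` and `fuse_weights` would be learned jointly with the backbone. Because `propagate` is the only model-specific part, swapping in a different predictor leaves the decomposition and fusion steps unchanged, which is one plausible reading of "model-agnostic" here.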