Earth system forecasting has traditionally relied on complex physical models that are computationally expensive and require significant domain expertise. Over the past decade, the unprecedented growth of spatiotemporal Earth observation data has enabled data-driven forecasting models based on deep learning. These models have shown promise across diverse Earth system forecasting tasks, but they either struggle to handle uncertainty or neglect domain-specific prior knowledge. As a result, they either average over possible futures and produce blurred forecasts, or they generate physically implausible predictions. To address these limitations, we propose a two-stage pipeline for probabilistic spatiotemporal forecasting: 1) We develop PreDiff, a conditional latent diffusion model capable of producing probabilistic forecasts. 2) We incorporate an explicit knowledge alignment mechanism that aligns forecasts with domain-specific physical constraints. This alignment is achieved by estimating the deviation from the imposed constraints at each denoising step and adjusting the transition distribution accordingly. We conduct empirical studies on two datasets: N-body MNIST, a synthetic dataset with chaotic behavior, and SEVIR, a real-world precipitation nowcasting dataset. Specifically, we impose the law of conservation of energy in N-body MNIST and the anticipated precipitation intensity in SEVIR. Experiments demonstrate the effectiveness of PreDiff in handling uncertainty, incorporating domain-specific prior knowledge, and generating forecasts with high operational utility.
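To make the knowledge alignment step more concrete, the following is a minimal toy sketch of one guided denoising transition, not the PreDiff implementation. It assumes a Gaussian transition with a known mean and standard deviation, and it stands in for the physical constraint with a hypothetical conserved quantity (the sum of the state), whose deviation is computed analytically rather than estimated by a learned network on latents. All function names and the `scale` parameter are illustrative assumptions.

```python
import numpy as np

def constraint_deviation(x, target):
    """Squared deviation of a toy conserved quantity (the sum of x) from target."""
    return (x.sum() - target) ** 2

def deviation_grad(x, target):
    """Analytic gradient of constraint_deviation with respect to x."""
    return 2.0 * (x.sum() - target) * np.ones_like(x)

def aligned_step(mean, sigma, target, scale, rng):
    """One guided transition (illustrative assumption, not the paper's code).

    The Gaussian mean is shifted against the gradient of the constraint
    deviation, scaled by the transition variance, before sampling.
    """
    shifted_mean = mean - scale * sigma**2 * deviation_grad(mean, target)
    sample = shifted_mean + sigma * rng.standard_normal(mean.shape)
    return sample, shifted_mean

rng = np.random.default_rng(0)
mean = np.array([1.0, 2.0, 3.0])  # unguided transition mean, sum = 6
sample, shifted = aligned_step(mean, sigma=0.1, target=3.0, scale=5.0, rng=rng)

# The shifted mean lies closer to the constraint than the unguided mean.
print(constraint_deviation(mean, 3.0), constraint_deviation(shifted, 3.0))
```

In this sketch, `scale` controls how strongly the constraint pulls the transition; setting it to zero recovers the unguided step. A real system would repeat this adjustment at every denoising step and would need a deviation estimate that is valid for noisy intermediate states.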