Understanding climate dynamics requires going beyond correlations in observational data to uncover the underlying causal processes. Latent drivers such as atmospheric processes play a central role in temporal dynamics, while direct causal influences also exist among geographically proximate observed variables. Traditional Causal Representation Learning (CRL) typically focuses on latent factors but overlooks such observable-to-observable causal relations, which limits its applicability to climate analysis. In this paper, we introduce a unified framework that jointly uncovers (i) causal relations among observed variables and (ii) latent driving forces together with their interactions. We establish conditions under which both the hidden dynamic process and the causal structure among observed variables are simultaneously identifiable from time-series data; these guarantees continue to hold in the nonparametric setting, where contextual information is used to recover the latent variables and causal relations. Building on these insights, we propose CaDRe (Causal Discovery and Representation learning), a time-series generative model with structural constraints that integrates CRL and causal discovery. Experiments on synthetic datasets validate our theoretical results. On real-world climate datasets, CaDRe delivers competitive forecasting accuracy and recovers causal graphs that, when visualized, align with domain expertise, thereby offering interpretable insights into climate systems. Code is available at https://github.com/MinghaoFu/CaDRe.