Kilometer-scale Earth system models are essential for capturing local climate change. However, these models are computationally expensive and produce petabyte-scale outputs, which limits their utility for applications such as probabilistic risk assessment. Here, we present the Field-Space Autoencoder, a scalable climate emulation framework based on a spherical compression model that overcomes these challenges. By utilizing Field-Space Attention, the model operates efficiently on native climate model output and thereby avoids the geometric distortions caused by forcing spherical data onto Euclidean grids. This approach preserves physical structures significantly better than convolutional baselines. Because it produces a structured compressed field, it serves as a good baseline for downstream generative emulation. In addition, the autoencoder can perform zero-shot super-resolution, mapping low-resolution large ensembles and scarce high-resolution data into a shared representation. We train a generative diffusion model on these compressed fields. The diffusion model can simultaneously learn internal variability from abundant low-resolution data and fine-scale physics from sparse high-resolution data. Our work bridges the gap between the high volume of low-resolution ensemble statistics and the scarcity of high-resolution physical detail.
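The geometric distortion mentioned in the abstract can be illustrated with a small, self-contained sketch. It is not the Field-Space Autoencoder or any part of the authors' method; it only shows why a regular latitude-longitude grid, treated as a flat Euclidean image, misrepresents the sphere. The function names `band_area_fractions` and `global_means`, the 1-degree grid, and the test field sin²(latitude) are all assumptions chosen for illustration.

```python
import math


def band_area_fractions(n_lat):
    """Fraction of a sphere's surface covered by each latitude band of a
    regular lat-lon grid with n_lat equally spaced bands (south to north)."""
    edges = [-math.pi / 2 + math.pi * i / n_lat for i in range(n_lat + 1)]
    # The area between two latitudes on a unit sphere is
    # 2*pi*(sin(top) - sin(bottom)); dividing by 4*pi gives the fraction.
    return [(math.sin(edges[i + 1]) - math.sin(edges[i])) / 2
            for i in range(n_lat)]


def global_means(field_of_lat, n_lat):
    """Return (unweighted, area_weighted) global means of a field that
    depends only on latitude, sampled at the band centers."""
    fractions = band_area_fractions(n_lat)
    centers = [-math.pi / 2 + math.pi * (i + 0.5) / n_lat
               for i in range(n_lat)]
    values = [field_of_lat(c) for c in centers]
    unweighted = sum(values) / n_lat  # treats every grid row as equal
    weighted = sum(v * w for v, w in zip(values, fractions))
    return unweighted, weighted


# Hypothetical 1-degree grid: 180 latitude bands.
fractions = band_area_fractions(180)
# Area of the band just north of the equator relative to the southernmost band.
ratio = fractions[90] / fractions[0]

# Illustrative field sin^2(lat); its exact area-weighted mean on the sphere is 1/3.
unweighted, weighted = global_means(lambda lat: math.sin(lat) ** 2, 180)

print(f"equatorial / polar band area ratio: {ratio:.1f}")
print(f"unweighted mean: {unweighted:.4f}, area-weighted mean: {weighted:.4f}")
```

On this grid the row next to the pole covers less than a hundredth of the area of a row at the equator, so any operator that treats grid rows as equal over-represents the poles. Area weighting corrects a global mean, but it does not remove the distortion of local shapes, which is the motivation the abstract gives for operating on the native spherical output.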