Most state-of-the-art approaches for weather and climate modeling are based on physics-informed numerical models of the atmosphere. These approaches aim to model the non-linear dynamics and complex interactions between multiple variables, which are challenging to approximate. Additionally, many such numerical models are computationally intensive, especially when modeling atmospheric phenomena at fine-grained spatial and temporal resolution. Recent data-driven approaches based on machine learning instead aim to directly solve a downstream forecasting or projection task by learning a functional mapping from data using deep neural networks. However, these networks are trained on curated and homogeneous climate datasets for specific spatiotemporal tasks, and thus lack the generality of numerical models. We develop and demonstrate ClimaX, a flexible and generalizable deep learning model for weather and climate science that can be trained on heterogeneous datasets spanning different variables, spatiotemporal coverage, and physical groundings. ClimaX extends the Transformer architecture with novel encoding and aggregation blocks that allow effective use of available compute while maintaining general utility. ClimaX is pretrained with a self-supervised learning objective on climate datasets derived from CMIP6. The pretrained ClimaX can then be fine-tuned to address a breadth of climate and weather tasks, including those that involve atmospheric variables and spatiotemporal scales unseen during pretraining. Compared to existing data-driven baselines, we show that this generality results in superior performance on benchmarks for weather forecasting and climate projections, even when ClimaX is pretrained at lower resolutions and with smaller compute budgets. The source code is available at https://github.com/microsoft/ClimaX.