A large number of Deep Learning Weather Prediction (DLWP) architectures, built on backbones including U-Net, Transformer, Graph Neural Network, and Fourier Neural Operator (FNO), have demonstrated their potential for forecasting atmospheric states. However, due to differences in training protocols, forecast horizons, and data choices, it remains unclear which (if any) of these methods and architectures are most suitable for weather forecasting and for future model development. Here, we step back and provide a detailed empirical analysis, under controlled conditions, comparing and contrasting the most prominent DLWP models, along with their backbones. We accomplish this by predicting synthetic two-dimensional incompressible Navier-Stokes dynamics and real-world global weather dynamics. On synthetic data, we observe favorable performance of FNO, while on the real-world WeatherBench dataset, our results demonstrate the suitability of ConvLSTM and SwinTransformer for short- to mid-range forecasts. For long-range weather rollouts of up to 50 years, we observe superior stability and physical soundness in architectures that formulate a spherical data representation, i.e., GraphCast and Spherical FNO. The code is available at https://github.com/amazon-science/dlwp-benchmark.