Comparing and Contrasting Deep Learning Weather Prediction Backbones on Navier-Stokes and Atmospheric Dynamics

Remarkable progress in the development of Deep Learning Weather Prediction (DLWP) models positions them to become competitive with traditional numerical weather prediction (NWP) models. Indeed, a wide range of DLWP architectures, built on backbones including U-Net, Transformer, Graph Neural Network (GNN), and Fourier Neural Operator (FNO), have demonstrated their potential for forecasting atmospheric states. However, due to differences in training protocols, forecast horizons, and data choices, it remains unclear which (if any) of these methods and architectures are most suitable for weather forecasting and for future model development. Here, we step back and provide a detailed empirical analysis, under controlled conditions, comparing and contrasting the most prominent DLWP models, along with their backbones. We do so on two tasks: predicting synthetic two-dimensional incompressible Navier-Stokes dynamics and predicting real-world global weather dynamics. Our results reveal various tradeoffs among accuracy, memory consumption, and runtime. For example, on synthetic data, we observe favorable performance of FNO; on the real-world WeatherBench dataset, our results demonstrate the suitability of ConvLSTM and SwinTransformer for short- to mid-range forecasts. For long-range weather rollouts of up to 365 days, we observe superior stability and physical soundness in architectures that formulate a spherical data representation, i.e., GraphCast and Spherical FNO. In addition, we observe that all of these model backbones "saturate," i.e., none of them exhibits so-called neural scaling, which highlights an important direction for future work on these and related models. The code is available at https://github.com/amazon-science/dlwp-benchmark.
