The rapid rise of deep learning (DL) in numerical weather prediction (NWP) has led to a proliferation of models that forecast atmospheric variables with skill comparable or superior to that of traditional physics-based NWP. However, these leading DL models vary widely in both their training settings and their architectures. Further, the lack of thorough ablation studies makes it hard to discern which components are most critical to success. In this work, we show that it is possible to attain high forecast skill even with relatively off-the-shelf architectures, simple training procedures, and moderate compute budgets. Specifically, we train a minimally modified SwinV2 transformer on ERA5 data and find that it attains superior forecast skill compared with IFS. We present ablations of key aspects of the training pipeline, exploring the effects of different loss functions, model sizes and depths, and multi-step fine-tuning. We also examine model performance with metrics beyond the typical ACC and RMSE, and investigate how performance scales with model size.
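For readers unfamiliar with the two headline metrics named above, the following is a minimal sketch of a latitude-weighted RMSE and an anomaly correlation coefficient (ACC) on a regular latitude-longitude grid. It assumes cosine-latitude weighting and the uncentered form of ACC, both common conventions in DL weather forecasting; the function names and the toy grid are hypothetical and are not taken from this work, whose exact metric definitions may differ.

```python
import math


def latitude_weights(lats_deg):
    """Cosine-latitude weights, normalized so that their mean is 1 (assumed convention)."""
    w = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_w = sum(w) / len(w)
    return [x / mean_w for x in w]


def weighted_rmse(forecast, truth, lats_deg):
    """Latitude-weighted RMSE; fields are lists of rows, one row per latitude."""
    weights = latitude_weights(lats_deg)
    total, count = 0.0, 0
    for w, f_row, t_row in zip(weights, forecast, truth):
        for f, t in zip(f_row, t_row):
            total += w * (f - t) ** 2
            count += 1
    return math.sqrt(total / count)


def weighted_acc(forecast, truth, climatology, lats_deg):
    """Latitude-weighted, uncentered ACC of forecast and truth anomalies.

    Anomalies are taken relative to `climatology`. The result is undefined
    (division by zero) if either anomaly field is identically zero.
    """
    weights = latitude_weights(lats_deg)
    num = f_var = t_var = 0.0
    for w, f_row, t_row, c_row in zip(weights, forecast, truth, climatology):
        for f, t, c in zip(f_row, t_row, c_row):
            fa, ta = f - c, t - c  # forecast and truth anomalies
            num += w * fa * ta
            f_var += w * fa * fa
            t_var += w * ta * ta
    return num / math.sqrt(f_var * t_var)


# Toy 3x2 grid (three latitudes, two longitudes); values are illustrative only.
lats = [-45.0, 0.0, 45.0]
truth = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
clim = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
biased = [[v + 1.0 for v in row] for row in truth]  # constant +1 error everywhere

rmse_biased = weighted_rmse(biased, truth, lats)
acc_perfect = weighted_acc(truth, truth, clim, lats)
```

Because the weights are normalized to a mean of 1, a uniform error of one unit gives a weighted RMSE of one unit, and a perfect forecast gives an ACC of 1. Lower RMSE and higher ACC both indicate better skill.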