Huge Ensembles Part I: Design of Ensemble Weather Forecasts using Spherical Fourier Neural Operators

Ankur Mahesh, William Collins, Boris Bonev, Noah Brenowitz, Yair Cohen, Joshua Elms, Peter Harrington, Karthik Kashinath, Thorsten Kurth, Joshua North, Travis O'Brien, Michael Pritchard, David Pruitt, Mark Risser, Shashank Subramanian, Jared Willard

Studying low-likelihood, high-impact extreme weather events in a warming world is a significant and challenging task for current ensemble forecasting systems. While these systems presently use up to 100 members, larger ensembles could enrich the sampling of internal variability and may capture the long tails associated with climate hazards better than traditional ensemble sizes. Due to computational constraints, it is infeasible to generate huge ensembles (comprising 1,000-10,000 members) with traditional, physics-based numerical models. In this two-part paper, we replace traditional numerical simulations with machine learning (ML) to generate hindcasts of huge ensembles. In Part I, we construct an ensemble weather forecasting system based on Spherical Fourier Neural Operators (SFNO), and we discuss important design decisions for constructing such an ensemble. The ensemble represents model uncertainty through perturbed-parameter techniques, and it represents initial condition uncertainty through bred vectors, which sample the fastest-growing modes of the forecast. Using the European Centre for Medium-Range Weather Forecasts Integrated Forecasting System (IFS) as a baseline, we develop an evaluation pipeline composed of mean, spectral, and extreme diagnostics. Using large-scale, distributed SFNOs with 1.1 billion learned parameters, we achieve calibrated probabilistic forecasts. As the trajectories of the individual members diverge, the ML ensemble mean spectra degrade with lead time, consistent with physical expectations. However, the spectra of the individual ensemble members stay constant with lead time. These members therefore simulate realistic weather states, and the ML ensemble passes a crucial spectral test in the literature. The IFS and ML ensembles have similar Extreme Forecast Indices, and we show that the ML extreme weather forecasts are reliable and discriminating.
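
To make the bred-vector idea concrete, the following is a minimal sketch of a breeding cycle, not the configuration used in the paper. It assumes a toy Lorenz-63 system in place of the SFNO forecast model, and the function names (`lorenz63_step`, `forecast`, `breed_vector`) as well as the amplitude, cycle count, and step count are hypothetical choices made for illustration. A small perturbation is added to a control state, both states are run forward, and the difference is rescaled to a fixed amplitude; repeating this lets the perturbation align with the fastest-growing directions of the flow.

```python
import numpy as np


def lorenz63_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz-63 toy model by one RK4 step (stand-in for a forecast model)."""
    def rhs(s):
        x, y, z = s
        return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

    k1 = rhs(state)
    k2 = rhs(state + 0.5 * dt * k1)
    k3 = rhs(state + 0.5 * dt * k2)
    k4 = rhs(state + dt * k3)
    return state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def forecast(state, n_steps):
    """Run the toy model forward n_steps from the given state."""
    for _ in range(n_steps):
        state = lorenz63_step(state)
    return state


def breed_vector(control, amplitude=0.1, n_cycles=20, steps_per_cycle=8, seed=0):
    """Run breeding cycles; return the final control state and the bred perturbation.

    All default values are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    # Start from a random perturbation scaled to the chosen amplitude.
    perturbation = rng.standard_normal(control.shape)
    perturbation *= amplitude / np.linalg.norm(perturbation)

    for _ in range(n_cycles):
        control_next = forecast(control, steps_per_cycle)
        perturbed_next = forecast(control + perturbation, steps_per_cycle)
        # The grown difference is rescaled back to the fixed amplitude.
        difference = perturbed_next - control_next
        perturbation = amplitude * difference / np.linalg.norm(difference)
        control = control_next

    return control, perturbation


# Build a two-member ensemble centered on the control state.
control, bred = breed_vector(np.array([1.0, 1.0, 1.0]))
members = np.stack([control + bred, control - bred])
```

Adding and subtracting the same bred vector keeps the ensemble mean equal to the control state at initialization, which is one common way to center perturbations; whether the paper's system does this is not stated in the abstract.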
