Weather forecasts sit upstream of high-stakes decisions in domains such as grid operations, aviation, agriculture, and emergency response. Yet forecast users often face a difficult trade-off. Many decision-relevant targets, such as extrema, accumulations, and threshold exceedances, are functionals of the atmospheric state variables rather than state variables themselves. As a result, users must estimate these targets via post-processing, which can be suboptimal and can introduce structural bias. The core issue is that decisions depend on distributions over these functionals, which the model is not trained to learn directly. In this work, we introduce GEM-2, a probabilistic transformer that jointly learns global atmospheric dynamics alongside a suite of variables that users directly act upon. Using this training recipe, we show that a lightweight (~275M parameters) and computationally efficient (~20-100x training speedup relative to state-of-the-art) transformer trained on the continuous ranked probability score (CRPS) objective can directly outperform operational numerical weather prediction (NWP) models and be competitive with ML models that rely on expensive multi-step diffusion processes or require bespoke multi-stage fine-tuning strategies. We further demonstrate state-of-the-art economic value metrics under decision-theoretic evaluation, stable convergence to climatology at subseasonal-to-seasonal (S2S) and seasonal timescales, and a surprising insensitivity to many commonly assumed architectural and training design choices.
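The CRPS objective mentioned above has a simple closed form for an ensemble forecast. As a minimal sketch (not GEM-2's actual training code, whose details are not given here), the standard kernel form CRPS = E|X − y| − ½·E|X − X′| can be computed for a set of ensemble members against a scalar observation:

```python
import numpy as np


def ensemble_crps(members: np.ndarray, obs: float) -> float:
    """Continuous ranked probability score of an ensemble forecast
    against a scalar observation (lower is better).

    Kernel form: CRPS = E|X - y| - 0.5 * E|X - X'|,
    where X, X' are independent draws from the forecast ensemble.
    """
    members = np.asarray(members, dtype=float)
    # Accuracy term: mean absolute error of each member vs. the observation.
    skill = np.abs(members - obs).mean()
    # Spread term: mean pairwise absolute difference among members.
    spread = np.abs(members[:, None] - members[None, :]).mean()
    return skill - 0.5 * spread


# A collapsed ensemble that exactly hits the observation scores 0;
# a two-member ensemble at {0, 1} verifying obs 0.5 scores 0.25.
print(ensemble_crps(np.array([2.0, 2.0, 2.0]), 2.0))
print(ensemble_crps(np.array([0.0, 1.0]), 0.5))
```

Because this score is differentiable almost everywhere in the member values, it can serve directly as a training loss for probabilistic models, which is what makes a CRPS-trained transformer attractive relative to multi-step diffusion sampling.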