Modern data-driven surrogate models for weather forecasting provide accurate short-term predictions but inaccurate and nonphysical long-term forecasts. This paper investigates online weather prediction using machine learning surrogates supplemented with partial and noisy observations. We demonstrate empirically and justify theoretically that, despite the long-time instability of the surrogates and the sparsity of the observations, filtering estimates can remain accurate over long time horizons. As a case study, we integrate FourCastNet, a state-of-the-art weather surrogate model, into a variational data assimilation framework using partial, noisy ERA5 data. Our results show that filtering estimates remain accurate over a year-long assimilation window and provide effective initial conditions for forecasting tasks, including extreme event prediction.
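The central claim, that filtering can stay accurate even when the surrogate is unstable in free run and only part of the state is observed, can be illustrated with a minimal toy sketch. This is not the paper's FourCastNet and ERA5 setup: the two-dimensional rotation "truth", the 5% per-step growth in the toy surrogate, the noise level, and all function names are assumptions chosen for illustration. The analysis step uses the standard closed-form minimizer of the quadratic 3DVar cost, x_a = x_f + K (y - H x_f) with K = B H^T (H B H^T + R)^{-1}.

```python
import numpy as np

def rotation(theta):
    """2-D rotation matrix; used as the (bounded) toy 'true' dynamics."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def three_dvar_gain(B, H, R):
    """Fixed 3DVar gain K = B H^T (H B H^T + R)^{-1}."""
    return B @ H.T @ np.linalg.inv(H @ B @ H.T + R)

def run_experiment(n_steps=200, theta=1.0, growth=1.05, obs_std=0.1, seed=0):
    """Compare a free-running toy surrogate with a 3DVar-filtered one.

    All parameters are hypothetical; nothing here comes from the paper.
    """
    rng = np.random.default_rng(seed)
    M_true = rotation(theta)            # truth: norm-preserving rotation
    M_surr = growth * rotation(theta)   # surrogate: small one-step error, unstable in free run
    H = np.array([[1.0, 0.0]])          # partial observation: first component only
    B = np.eye(2)                       # assumed background covariance
    R = np.array([[obs_std ** 2]])      # observation noise covariance
    K = three_dvar_gain(B, H, R)

    x_true = np.array([1.0, 0.0])
    x_free = x_true.copy()              # surrogate run without observations
    x_filt = np.zeros(2)                # filter starts from a wrong initial state
    free_err, filt_err = [], []
    for _ in range(n_steps):
        x_true = M_true @ x_true
        x_free = M_surr @ x_free
        x_fore = M_surr @ x_filt                           # forecast with the surrogate
        y = H @ x_true + obs_std * rng.standard_normal(1)  # noisy partial observation
        x_filt = x_fore + K @ (y - H @ x_fore)             # 3DVar analysis update
        free_err.append(np.linalg.norm(x_free - x_true))
        filt_err.append(np.linalg.norm(x_filt - x_true))
    return np.array(free_err), np.array(filt_err)

if __name__ == "__main__":
    free_err, filt_err = run_experiment()
    print("free-run error at final step:", free_err[-1])
    print("mean filter error, last 100 steps:", filt_err[-100:].mean())
```

In this toy setting the free-run error grows geometrically, while the filtered error stays bounded because the rotation couples the unobserved component to the observed one. Whether the same mechanism holds for a realistic surrogate depends on conditions that this sketch does not attempt to reproduce.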