Decision-oriented benchmarking to transform AI weather forecast access: Application to the Indian monsoon

Rajat Masiwal, Colin Aitken, Adam Marchakitus, Mayank Gupta, Katherine Kowal, Hamid A. Pahlavan, Tyler Yang, Y. Qiang Sun, Michael Kremer, Amir Jina, William R. Boos, Pedram Hassanzadeh

Artificial intelligence weather prediction (AIWP) models now often outperform traditional physics-based models on common metrics while requiring orders of magnitude fewer computing resources and less time. Open-access AIWP models thus hold promise as transformational tools for helping low- and middle-income populations make decisions in the face of high-impact weather shocks. Yet current approaches to evaluating AIWP models focus mainly on aggregated meteorological metrics without considering local stakeholders' needs in decision-oriented, operational frameworks. Here, we introduce such a framework that connects meteorology, AI, and social sciences. As an example, we apply it to the 150-year-old problem of Indian monsoon forecasting, focusing on benefits to rain-fed agriculture, which is highly susceptible to climate change. AIWP models skillfully predict an agriculturally relevant onset index at regional scales weeks in advance when evaluated out-of-sample using deterministic and probabilistic metrics. This framework informed a government-led effort in 2025 to send 38 million Indian farmers AI-based monsoon onset forecasts, which captured an unusual weeks-long pause in monsoon progression. This decision-oriented benchmarking framework provides a key component of a blueprint for harnessing the power of AIWP models to help large vulnerable populations adapt to weather shocks in the face of climate variability and change.
