Subseasonal forecasting, the prediction of weather two to six weeks in advance, is critical for resource allocation and climate adaptation but poses many challenges for the forecasting community. At this forecast horizon, physics-based dynamical models have limited skill, and the prediction targets depend in a complex manner on both local weather and global climate variables. Recently, machine learning methods have shown promise in advancing the state of the art, but only at the cost of complex data curation: expert knowledge must be combined with the aggregation of multiple relevant data sources that differ in file format and in temporal and spatial resolution. To streamline this process and accelerate future development, we introduce SubseasonalClimateUSA, a curated dataset for training and benchmarking subseasonal forecasting models in the United States. We use this dataset to benchmark a diverse suite of subseasonal models, including operational dynamical models, classical meteorological baselines, and ten state-of-the-art machine learning and deep learning methods from the literature. Overall, our benchmarks suggest simple and effective ways to extend the accuracy of current operational models. SubseasonalClimateUSA is regularly updated and accessible via the subseasonal_data Python package (https://github.com/microsoft/subseasonal_data/).