With climate change-related extreme events on the rise, high-dimensional Earth observation data presents a unique opportunity for forecasting and understanding impacts on ecosystems. This is, however, impeded by the complexity of processing, visualizing, modeling, and explaining this data. To showcase how this challenge can be met, we train a convolutional long short-term memory-based architecture on the novel DeepExtremeCubes dataset. DeepExtremeCubes includes around 40,000 long-term Sentinel-2 minicubes (January 2016 to October 2022) worldwide, along with labeled extreme events, meteorological data, vegetation land cover, and topography maps, sampled from locations affected by extreme climate events and from surrounding areas. When predicting future reflectances and vegetation impacts through the kernel normalized difference vegetation index (kNDVI), the model achieved an R$^2$ score of 0.9055 on the test set. Explainable artificial intelligence was used to analyze the model's predictions during the October 2020 Central South America compound heatwave and drought event. We chose the same area exactly one year before the event as a counterfactual, finding that average temperature and surface pressure are generally the best predictors under normal conditions. In contrast, minimum anomalies of evaporation and surface latent heat flux become the dominant predictors during the event. A change of regime is also observed in the attributions before the event, which might help assess how long the event was building up before it occurred. The code to replicate all experiments and figures in this paper is publicly available at https://github.com/DeepExtremes/txyXAI
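For readers unfamiliar with the two quantities quoted above, the following is a minimal sketch of how kNDVI and an R$^2$ score are commonly computed. It is not taken from the paper's code: the function names `kndvi` and `r2` are hypothetical, and the kernel length scale is assumed to be the commonly used default $\sigma = 0.5\,(\mathrm{NIR} + \mathrm{red})$, under which kNDVI reduces to $\tanh(\mathrm{NDVI}^2)$. The authors' exact parameterization and evaluation protocol may differ.

```python
import numpy as np


def kndvi(nir, red, sigma=None):
    """Kernel NDVI with an RBF kernel: tanh(((nir - red) / (2 * sigma))**2).

    Assumption: if sigma is not given, use sigma = 0.5 * (nir + red),
    which makes the result equal to tanh(NDVI**2).
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    if sigma is None:
        sigma = 0.5 * (nir + red)
    sigma = np.asarray(sigma, dtype=float)
    # Avoid division by zero: pixels with sigma == 0 become NaN.
    sigma = np.where(sigma == 0, np.nan, sigma)
    return np.tanh(((nir - red) / (2.0 * sigma)) ** 2)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Illustrative reflectance values only (not data from the paper).
nir = np.array([0.5, 0.4, 0.3])
red = np.array([0.1, 0.1, 0.2])
index = kndvi(nir, red)
```

With the default length scale, each value of `index` equals `np.tanh(ndvi ** 2)` for the corresponding NDVI, so the index stays in the range [0, 1) and saturates more slowly than NDVI over dense vegetation.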