As climate extremes increase in frequency and intensity, robust analytical tools are crucial for predicting their impacts on terrestrial ecosystems. Machine learning techniques show promise but require well-structured, high-quality, curated, analysis-ready datasets. Earth observation datasets comprehensively monitor ecosystem dynamics and responses to climatic extremes, yet their complexity can limit the effectiveness of machine learning models. Despite recent progress in applying deep learning to ecosystem monitoring, there is a need for datasets specifically designed to analyse the impacts of compound heatwave and drought extremes. Here, we introduce the DeepExtremeCubes database, tailored to map around these extremes, focusing on persistent natural vegetation. It comprises over 40,000 small data cubes (minicubes) sampled globally, each with a spatial coverage of 2.5 by 2.5 km. Each minicube includes (i) Sentinel-2 L2A images, (ii) ERA5-Land variables and a generated extreme-event cube covering 2016 to 2022, and (iii) ancillary land cover and topography maps. The paper aims to (1) streamline data access, structuring, and pre-processing and enhance scientific reproducibility, and (2) facilitate forecasting of biosphere dynamics in response to compound extremes.