Earth observation (EO) applications involving complex and heterogeneous data sources are commonly approached with machine learning models. However, these models typically assume that all data sources will be persistently available. In practice, the availability of EO sources can be affected by noise, clouds, or satellite mission failures. In this work, we assess the impact of missing temporal and static EO sources on trained models across four datasets covering classification and regression tasks. We compare the predictive quality of different methods and find that some are naturally more robust to missing data. The Ensemble strategy, in particular, achieves a prediction robustness of up to 100%. We show that missing-source scenarios are significantly more challenging in regression than in classification tasks. Finally, we find that, among views missing individually, the optical view is the most critical.