Counterfactual explanations are an emerging tool for improving the interpretability of deep learning models. Given a sample, these methods search for similar samples that lie on the other side of the decision boundary and present them to the user. In this paper, we propose a generative adversarial counterfactual approach for satellite image time series in a multi-class setting, applied to the land cover classification task. One distinctive feature of the proposed approach is that it makes no prior assumption about the target class of a given counterfactual explanation. This flexibility allows the discovery of interesting information about the relationships between land cover classes. The other distinctive feature is that the counterfactual is encouraged to differ from the original sample only in a small, compact temporal segment. These time-contiguous perturbations yield a much sparser and thus more interpretable solution. Furthermore, the plausibility (realism) of the generated counterfactual explanations is enforced through the proposed adversarial learning strategy.
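To make the notion of a time-contiguous perturbation concrete, the following minimal sketch shows one way to check whether a counterfactual differs from its original only within a single compact temporal segment. It is an illustration under stated assumptions, not the method proposed here: the helper names (`perturbation_profile`, `changed_segment`, `is_time_contiguous`), the tolerance `tol`, and the toy data are all hypothetical, and the time series is assumed to be stored as a list of timesteps, each holding one value per spectral channel.

```python
from typing import List, Optional, Tuple

# Assumed layout: series[t][c] is the value of channel c at timestep t.
Series = List[List[float]]


def perturbation_profile(original: Series, counterfactual: Series) -> List[float]:
    """Return the L1 magnitude of the change at each timestep."""
    return [
        sum(abs(c - o) for o, c in zip(row_o, row_c))
        for row_o, row_c in zip(original, counterfactual)
    ]


def changed_segment(profile: List[float], tol: float = 1e-6) -> Optional[Tuple[int, int]]:
    """Return (first, last) timestep whose change exceeds tol, or None if nothing changed."""
    changed = [t for t, magnitude in enumerate(profile) if magnitude > tol]
    return (changed[0], changed[-1]) if changed else None


def is_time_contiguous(profile: List[float], tol: float = 1e-6) -> bool:
    """True if every timestep between the first and last change is itself changed."""
    segment = changed_segment(profile, tol)
    if segment is None:
        return True
    start, end = segment
    return all(magnitude > tol for magnitude in profile[start:end + 1])


# Hypothetical toy data: 6 timesteps, 2 channels, constant original series.
original = [[0.5, 0.5] for _ in range(6)]
counterfactual = [row[:] for row in original]
counterfactual[2][0] = 0.75  # perturb channel 0 at timesteps 2 and 3 only
counterfactual[3][0] = 0.75

profile = perturbation_profile(original, counterfactual)
segment = changed_segment(profile)
contiguous = is_time_contiguous(profile)
```

In this toy case the change is confined to two adjacent timesteps, so the perturbation is both sparse and contiguous. A perturbation touching timesteps 1 and 4 only would be equally sparse but would fail the contiguity check.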