Explainability plays a crucial role in providing a more comprehensive understanding of deep learning models' behaviour. It allows for thorough validation of a model's performance, ensuring that its decisions are based on relevant visual indicators rather than on irrelevant patterns present in the training data. However, existing methods provide only instance-level explainability, which requires manual analysis of each sample. Such manual review is time-consuming and prone to human biases. To address this issue, the concept of second-order explainable AI (SOXAI) was recently proposed to extend explainable AI (XAI) from the instance level to the dataset level. SOXAI automates the analysis of the connections between quantitative explanations and dataset biases by identifying prevalent concepts. In this work, we explore the use of this higher-level interpretation of a deep neural network's behaviour to "explain the explainability" and obtain actionable insights. Specifically, we demonstrate for the first time, via example classification and segmentation cases, that eliminating irrelevant concepts from the training set based on actionable insights from SOXAI can enhance a model's performance.