Ensuring both transparency and safety is critical when deploying Deep Neural Networks (DNNs) in high-risk applications such as medicine. The field of explainable AI (XAI) has proposed various methods to comprehend the decision-making processes of opaque DNNs. However, only a few XAI methods are suitable for ensuring safety in practice, as most rely heavily on repeated, labor-intensive, and possibly biased human assessment. In this work, we present a novel post-hoc concept-based XAI framework that conveys not only instance-wise (local) but also class-wise (global) decision-making strategies via prototypes. What sets our approach apart is the combination of local and global strategies, which enables a clearer understanding of the (dis-)similarities between a model's decisions and the expected (prototypical) concept use, ultimately reducing the dependence on long-term human assessment. Quantifying the deviation from prototypical behavior allows us not only to associate predictions with specific model sub-strategies but also to detect outlier behavior. As such, our approach constitutes an intuitive and explainable tool for model validation. We demonstrate the effectiveness of our approach in identifying out-of-distribution samples, spurious model behavior, and data quality issues across three datasets (ImageNet, CUB-200, and CIFAR-10) using VGG, ResNet, and EfficientNet architectures. Code is available at https://github.com/maxdreyer/pcx.
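To make the idea of quantifying deviation from prototypical behavior concrete, the following is a minimal sketch and not the implementation from the linked repository. The abstract does not state how prototypes or deviations are computed, so the sketch assumes that each sample is already summarized by a vector of per-concept scores, that a class prototype is simply the mean of those vectors, and that deviation is the Euclidean distance to the prototype. The function names `fit_prototypes`, `deviation`, `outlier_thresholds`, and `is_outlier`, as well as the 0.95 quantile, are hypothetical choices made for illustration.

```python
import numpy as np


def fit_prototypes(concept_vectors, labels):
    """Return {class_label: mean concept vector}.

    Assumption: a class prototype is the mean of the per-sample concept
    vectors of that class (a simple stand-in, not the paper's method).
    """
    return {
        int(c): concept_vectors[labels == c].mean(axis=0)
        for c in np.unique(labels)
    }


def deviation(vector, prototype):
    """Euclidean distance between a sample's concept vector and a prototype."""
    return float(np.linalg.norm(vector - prototype))


def outlier_thresholds(concept_vectors, labels, prototypes, quantile=0.95):
    """Per-class threshold: the chosen quantile of in-class deviations."""
    thresholds = {}
    for c, proto in prototypes.items():
        dists = np.linalg.norm(concept_vectors[labels == c] - proto, axis=1)
        thresholds[c] = float(np.quantile(dists, quantile))
    return thresholds


def is_outlier(vector, predicted_class, prototypes, thresholds):
    """Flag a prediction whose concept use is unusually far from the prototype."""
    dist = deviation(vector, prototypes[predicted_class])
    return dist > thresholds[predicted_class]
```

In this sketch, a prediction is flagged when its concept vector lies farther from the prototype of the predicted class than most training samples of that class do. A single mean per class cannot capture several distinct sub-strategies within one class; representing those would require more than one prototype per class, which this sketch deliberately leaves out.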