Ensuring both transparency and safety is critical when deploying Deep Neural Networks (DNNs) in high-risk applications such as medicine. The field of explainable AI (XAI) has proposed various methods to comprehend the decision-making processes of opaque DNNs. However, only a few XAI methods are suitable for ensuring safety in practice, as most rely heavily on repeated, labor-intensive, and possibly biased human assessment. In this work, we present a novel post-hoc concept-based XAI framework that conveys not only instance-wise (local) but also class-wise (global) decision-making strategies via prototypes. What sets our approach apart is the combination of local and global strategies, enabling a clearer understanding of the (dis-)similarities between a model's decisions and the expected (prototypical) concept use, ultimately reducing the dependence on long-term human assessment. Quantifying the deviation from prototypical behavior not only allows us to associate predictions with specific model sub-strategies but also to detect outlier behavior. As such, our approach constitutes an intuitive and explainable tool for model validation. We demonstrate the effectiveness of our approach in identifying out-of-distribution samples, spurious model behavior, and data quality issues across three datasets (ImageNet, CUB-200, and CIFAR-10), using VGG, ResNet, and EfficientNet architectures. Code is available at https://github.com/maxdreyer/pcx.
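The idea of prototypes as class-wise summaries of concept use, with outliers flagged by their deviation from prototypical behavior, can be illustrated with a minimal sketch. This is not the authors' implementation: the concept-relevance vectors here are synthetic, and a Gaussian mixture fitted per class is assumed as one simple way to obtain prototypes and a deviation score.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical setup: each row is a concept-relevance vector for one
# training sample of a single class (in practice, such vectors would come
# from a concept-based attribution method); data here is illustrative only.
rng = np.random.default_rng(0)
train_relevances = rng.normal(loc=1.0, scale=0.2, size=(200, 8))

# Prototypes as components of a Gaussian mixture fitted on the class-wise
# relevance distribution; each component mean summarizes one sub-strategy.
gmm = GaussianMixture(n_components=2, random_state=0).fit(train_relevances)
prototypes = gmm.means_  # shape: (n_sub_strategies, n_concepts)

# Deviation from prototypical behavior: a low log-likelihood under the
# mixture flags samples whose concept use differs from the expected one.
typical = rng.normal(loc=1.0, scale=0.2, size=(1, 8))
outlier = rng.normal(loc=5.0, scale=0.2, size=(1, 8))
assert gmm.score(typical) > gmm.score(outlier)
```

A test sample can then be assigned to the sub-strategy of its nearest prototype, while samples scoring below a chosen likelihood threshold would be surfaced for inspection as potential out-of-distribution or spurious cases.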