Existing explanation tools for image classifiers usually give only a single explanation for an image. For many images, however, both humans and image classifiers accept more than one explanation for the image label. Restricting the number of explanations to one therefore severely limits insight into the behavior of the classifier. In this paper, we describe an algorithm and a tool, REX, for computing multiple explanations of the output of a black-box image classifier for a given image. Our algorithm uses a principled approach based on causal theory. We analyse its theoretical complexity and provide experimental results showing that, on the ImageNet-mini benchmark, REX finds multiple explanations on 7 times more images than previous work.