Deep learning models are powerful image classifiers, but their opacity hinders their trustworthiness. Explanation methods that faithfully and clearly capture the reasoning process within these classifiers are scarce, owing to the classifiers' sheer complexity and size. We address this problem by defining a novel method for explaining the outputs of image classifiers with debates between two agents, each arguing for a particular class. We obtain these debates as concrete instances of Free Argumentative eXchanges (FAXs), a novel argumentation-based multi-agent framework allowing agents to internalise opinions expressed by other agents differently than originally stated. We define two metrics (consensus and persuasion rate) to assess the usefulness of FAXs as argumentative explanations for image classifiers. We then conduct a number of empirical experiments showing that FAXs perform well along these metrics and are more faithful to the image classifiers than conventional, non-argumentative explanation methods. All our implementations can be found at https://github.com/koriavinash1/FAX.