Text-to-Image Models for Counterfactual Explanations: a Black-Box Approach

This paper addresses the challenge of generating Counterfactual Explanations (CEs), involving the identification and modification of the fewest necessary features to alter a classifier's prediction for a given image. Our proposed method, Text-to-Image Models for Counterfactual Explanations (TIME), is a black-box counterfactual technique based on distillation. Unlike previous methods, this approach requires solely the image and its prediction, omitting the need for the classifier's structure, parameters, or gradients. Before generating the counterfactuals, TIME introduces two distinct biases into Stable Diffusion in the form of textual embeddings: the context bias, associated with the image's structure, and the class bias, linked to class-specific features learned by the target classifier. After learning these biases, we find the optimal latent code applying the classifier's predicted class token and regenerate the image using the target embedding as conditioning, producing the counterfactual explanation. Extensive empirical studies validate that TIME can generate explanations of comparable effectiveness even when operating within a black-box setting.

翻译：本文解决了生成反事实解释的挑战，涉及识别并修改最少数量的必要特征以改变分类器对给定图像的预测。我们提出的方法——基于文本到图像模型的反事实解释（TIME），是一种基于蒸馏的黑盒反事实技术。与先前方法不同，本方法仅需图像及其预测结果，无需分类器的结构、参数或梯度。在生成反事实之前，TIME以文本嵌入的形式向Stable Diffusion引入两类不同的偏差：与图像结构相关的上下文偏差，以及与目标分类器学习到的类别特定特征相关的类别偏差。学习这些偏差后，我们通过应用分类器预测的类别标记找到最优潜在编码，并以目标嵌入作为条件重新生成图像，从而产生反事实解释。广泛的实证研究验证了TIME在黑盒设置下也能生成效果相当的解释。

相关内容

黑盒

关注 0

在科学，计算和工程学中，黑盒是一种设备，系统或对象，可以根据其输入和输出（或传输特性）对其进行查看，而无需对其内部工作有任何了解。它的实现是“不透明的”（黑色）。几乎任何事物都可以被称为黑盒：晶体管，引擎，算法，人脑，机构或政府。为了使用典型的“黑匣子方法”来分析建模为开放系统的事物，仅考虑刺激/响应的行为，以推断（未知）盒子。该黑匣子系统的通常表示形式是在该方框中居中的数据流程图。黑盒的对立面是一个内部组件或逻辑可用于检查的系统，通常将其称为白盒（有时也称为“透明盒”或“玻璃盒”）。

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日