Although many methods have been proposed to estimate the attributions of input variables, masking-based attribution methods still have a significant theoretical flaw: it is hard to examine whether the masking method faithfully represents the absence of input variables. Specifically, masking-based attribution methods typically represent the absence of an input variable by setting it to a baseline value. However, no studies have investigated how to represent the absence of input variables or how to verify the faithfulness of baseline values. We therefore revisit the feature representation of a deep neural network (DNN) from the perspective of causality, and propose using causal patterns to examine whether a masking method faithfully removes the information encoded in input variables. More crucially, we prove that this causality can be explained as the elementary rationale of the Shapley value. Furthermore, we define the optimal baseline value from the perspective of causality and propose a method to learn it. Experimental results demonstrate the effectiveness of our method.
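To make the role of the baseline value concrete, the following is a minimal sketch of masking-based Shapley attribution on a toy function. It is an illustration only, not the method proposed in this work: the function `toy_model`, the helper names, the input, and both baseline vectors are hypothetical choices made for the example. Absent variables are represented by replacing them with their baseline values, and the resulting attributions change when the baseline changes.

```python
from itertools import combinations
from math import factorial


def masked_output(model, x, baseline, present):
    """Evaluate the model with every variable outside `present` set to its baseline value."""
    masked = [x[i] if i in present else baseline[i] for i in range(len(x))]
    return model(masked)


def shapley_values(model, x, baseline):
    """Exact Shapley values, enumerating all subsets (feasible only for a few variables)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for subsets of size k that exclude variable i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                present = set(subset)
                gain = (masked_output(model, x, baseline, present | {i})
                        - masked_output(model, x, baseline, present))
                phi[i] += weight * gain
    return phi


def toy_model(v):
    # Hypothetical stand-in for a DNN: one interaction term plus one linear term
    return v[0] * v[1] + 2.0 * v[2]


x = [1.0, 2.0, 3.0]
phi_zero = shapley_values(toy_model, x, [0.0, 0.0, 0.0])  # all-zero baseline
phi_ones = shapley_values(toy_model, x, [1.0, 1.0, 1.0])  # all-one baseline
print(phi_zero)
print(phi_ones)
```

In this sketch the first variable receives a nonzero attribution under the all-zero baseline but a zero attribution under the all-one baseline, because its value coincides with that baseline and masking it changes nothing. This is the kind of baseline dependence that motivates asking whether a given baseline faithfully represents absence.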