The Shapley value is widely regarded as a trustworthy attribution metric. However, when Shapley values are used to explain the attribution of the input variables of a deep neural network (DNN), approximating them with reasonable accuracy usually incurs a very high computational cost in real-world applications. We therefore propose a novel network architecture, the HarsanyiNet, which makes inferences on the input sample and simultaneously computes the exact Shapley values of the input variables in a single forward propagation. The HarsanyiNet is designed on the theoretical foundation that the Shapley value can be reformulated as a redistribution of the Harsanyi interactions encoded by the network.
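The reformulation mentioned above can be illustrated on a tiny cooperative game. The sketch below is not the HarsanyiNet architecture. It is a brute-force check, under stated assumptions, of the classical identity that each variable's Shapley value equals the sum of the Harsanyi interactions of all subsets containing that variable, with each interaction split evenly among its members. The function names and `toy_game` are hypothetical, chosen for illustration only. `toy_game` stands in for a model output evaluated on a subset of input variables.

```python
from itertools import combinations
from math import factorial


def subsets(items):
    """Yield every subset of `items` as a frozenset, including the empty set."""
    items = tuple(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def harsanyi_interactions(v, players):
    """Harsanyi interaction I(S) = sum over L subset of S of (-1)^(|S|-|L|) * v(L)."""
    return {
        S: sum((-1) ** (len(S) - len(L)) * v(L) for L in subsets(S))
        for S in subsets(players)
    }


def shapley_from_interactions(interactions, players):
    """Redistribute each interaction I(S) uniformly over the members of S."""
    return {
        i: sum(val / len(S) for S, val in interactions.items() if i in S)
        for i in players
    }


def shapley_by_definition(v, players):
    """Reference Shapley value: weighted average of marginal contributions."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for S in subsets(others):
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            total += weight * (v(S | {i}) - v(S))
        phi[i] = total
    return phi


def toy_game(S):
    """Hypothetical value function over three input variables (0, 1, 2)."""
    value = 0.0
    if 0 in S:
        value += 1.0   # effect of variable 0 alone
    if 1 in S:
        value += 2.0   # effect of variable 1 alone
    if 0 in S and 1 in S:
        value += 3.0   # pairwise interaction between 0 and 1
    if {0, 1, 2} <= S:
        value -= 1.5   # three-way interaction
    return value


players = [0, 1, 2]
interactions = harsanyi_interactions(toy_game, players)
phi_redistributed = shapley_from_interactions(interactions, players)
phi_reference = shapley_by_definition(toy_game, players)

for i in players:
    print(i, phi_redistributed[i], phi_reference[i])
```

Both routes enumerate all subsets and so cost exponential time in the number of variables; the sketch only verifies the identity on a small example and says nothing about how a network could encode the interactions so that they are available in one forward pass.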