Efforts to decode deep neural networks (DNNs) often involve mapping their predictions back to the input features. Among these methods, Integrated Gradients (IG) has emerged as a significant technique. The selection of appropriate baselines in IG is crucial for crafting meaningful and unbiased explanations of model predictions in diverse settings. The standard approach of utilizing a single baseline, however, is frequently inadequate, prompting the need for multiple baselines. Leveraging the natural link between IG and the Aumann-Shapley Value, we provide a novel outlook on baseline design. Theoretically, we demonstrate that under certain assumptions, a collection of baselines aligns with the coalitions described by the Shapley Value. Building on this insight, we develop a new baseline method called Shapley Integrated Gradients (SIG), which uses proportional sampling to mirror the Shapley Value computation process. Simulations conducted in GridWorld validate that SIG effectively emulates the distribution of Shapley Values. Moreover, empirical tests on various image processing tasks show that SIG surpasses traditional IG baseline methods by offering more precise estimates of feature contributions, providing consistent explanations across different applications, and ensuring adaptability to diverse data types with negligible additional computational demand.
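To make the multi-baseline setting concrete, the sketch below approximates Integrated Gradients with a Riemann sum and averages attributions over a set of baselines. This is a minimal illustration of the general idea only, not the SIG proportional-sampling procedure from the paper; the toy function `f`, its gradient, and the chosen baselines are hypothetical stand-ins for a trained DNN.

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=50):
    # Midpoint Riemann approximation of the IG path integral
    # from the baseline to the input x.
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        total += grad_f(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

def multi_baseline_ig(grad_f, x, baselines):
    # Naive multi-baseline variant: average IG attributions over a
    # set of baselines (the setting that motivates SIG).
    return np.mean(
        [integrated_gradients(grad_f, x, b) for b in baselines], axis=0
    )

# Toy differentiable model standing in for a DNN output:
# f(x) = x1 * x2 + x1, with its analytic gradient.
f = lambda x: x[0] * x[1] + x[0]
grad_f = lambda x: np.array([x[1] + 1.0, x[0]])

x = np.array([2.0, 3.0])
baselines = [np.zeros(2), np.array([1.0, 0.0])]
attr = multi_baseline_ig(grad_f, x, baselines)

# Completeness: averaged attributions sum to f(x) minus the
# mean model output over the baselines.
print(attr.sum(), f(x) - np.mean([f(b) for b in baselines]))
```

The completeness check at the end is the key sanity test for any IG variant: attributions must account for the change in model output relative to the baseline(s).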