Computing SHAP Efficiently Using Model Structure Information

SHAP (SHapley Additive exPlanations) has become a popular method for attributing a machine learning model's prediction on an input to its features. One main challenge of SHAP is computation time: computing Shapley values exactly takes time exponential in the number of features, so many approximation methods have been proposed in the literature. In this paper, we propose methods that compute SHAP exactly in polynomial time or even faster for SHAP definitions that satisfy our additivity and dummy assumptions (e.g., kernel SHAP and baseline SHAP). We develop different strategies for different levels of model structure information: known functional decomposition, known order of the model (defined as the highest order of interaction in the model), or unknown order. For the first case, we demonstrate an additive property and a way to compute SHAP from the lower-order functional components. For the second case, we derive formulas that compute SHAP in polynomial time. Both methods yield exact SHAP results. Finally, if even the order of the model is unknown, we propose an iterative way to approximate Shapley values. The three methods we propose are computationally efficient when the order of the model is not high, which is typically the case in practice. We compare against the sampling approach proposed in Castor & Gomez (2008) using simulation studies to demonstrate the efficacy of our proposed methods.
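The known-decomposition case can be illustrated with a small sketch. It is not the paper's algorithm or its formulas; it only shows, for baseline SHAP on a hypothetical order-2 model, how the additivity and dummy properties let each component be explained over its own few features instead of enumerating subsets of all features. The names `baseline_shap`, `shap_from_decomposition`, `components`, and the toy model are assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def baseline_shap(f, x, baseline, features):
    """Exact baseline Shapley values of f at x, enumerating subsets of `features` only.

    Features outside a subset are held at their baseline value.
    Cost is exponential in len(features).
    """
    n = len(features)

    def value(subset):
        z = list(baseline)
        for j in subset:
            z[j] = x[j]
        return f(z)

    phi = {j: 0.0 for j in features}
    for j in features:
        others = [k for k in features if k != j]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for s in combinations(others, size):
                phi[j] += weight * (value(s + (j,)) - value(s))
    return phi


# Hypothetical model with a known decomposition into components of order <= 2.
# Each entry is (component function, indices of the features it depends on).
components = [
    (lambda z: 2.0 * z[0], [0]),
    (lambda z: z[1] * z[2], [1, 2]),
    (lambda z: z[2] - z[3] ** 2, [2, 3]),
]


def model(z):
    return sum(g(z) for g, _ in components)


def shap_from_decomposition(components, x, baseline):
    """Sum per-component Shapley values (additivity); features a component
    does not use contribute zero to it (dummy), so they are skipped."""
    total = {j: 0.0 for j in range(len(x))}
    for g, feats in components:
        for j, v in baseline_shap(g, x, baseline, feats).items():
            total[j] += v
    return total


x = [1.0, 2.0, 3.0, 4.0]
baseline = [0.0, 0.0, 0.0, 0.0]

full = baseline_shap(model, x, baseline, list(range(len(x))))  # all 4 features
fast = shap_from_decomposition(components, x, baseline)        # at most 2 at a time
print(full)
print(fast)
```

In this toy setup both routes give the same attributions, but the decomposed route only ever enumerates subsets of at most two features per component, so its cost grows with the number of components rather than exponentially with the total number of features.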

