Score-based explainable machine-learning techniques are often used to understand the logic behind black-box models. However, such explanation techniques are typically computationally expensive, which limits their application in time-critical contexts. Therefore, we propose and investigate the use of computationally less costly regression models for approximating the output of score-based explanation techniques, such as SHAP. Moreover, validity guarantees for the approximated values are provided by the employed inductive conformal prediction framework. We propose several non-conformity measures designed to take the difficulty of approximating the explanations into account while keeping the computational cost low. We present results from a large-scale empirical investigation, in which the approximate explanations generated by our proposed models are evaluated with respect to efficiency (interval size). The results indicate that the proposed method can significantly improve execution time compared to the fast version of SHAP, TreeSHAP. The results also suggest that the proposed method can produce tight intervals while providing validity guarantees. Moreover, the proposed approach allows for comparing the explanations of different approximation methods and selecting a method based on how informative (tight) the predicted intervals are.
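The approach summarized above can be illustrated with a minimal sketch of inductive conformal regression applied to explanation scores. It is not the implementation evaluated in this work. The function `expensive_score` is a hypothetical stand-in for a costly explainer such as SHAP. The k-nearest-neighbour surrogate `knn_predict`, the helper `conformal_quantile`, and the plain absolute-error non-conformity measure are all assumptions chosen for brevity. The difficulty-aware non-conformity measures proposed here are not shown.

```python
import math
import random


def expensive_score(x):
    # Hypothetical stand-in for a costly explainer, e.g. one feature's SHAP value.
    return 1.5 * x[0] + math.sin(3.0 * x[1])


def knn_predict(train_X, train_y, x, k=5):
    # Cheap surrogate regressor: average score of the k nearest training points.
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    return sum(train_y[i] for i in nearest) / k


def conformal_quantile(cal_scores, epsilon):
    # Calibration score at rank ceil((1 - epsilon) * (n + 1)) in ascending order.
    ordered = sorted(cal_scores)
    rank = math.ceil((1 - epsilon) * (len(ordered) + 1))
    # Too few calibration points for this epsilon: only the trivial interval is valid.
    return ordered[rank - 1] if rank <= len(ordered) else math.inf


random.seed(0)
X = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(600)]
y = [expensive_score(x) for x in X]

# Disjoint proper-training, calibration, and test sets.
train_X, train_y = X[:300], y[:300]
cal_X, cal_y = X[300:450], y[300:450]
test_X, test_y = X[450:], y[450:]

epsilon = 0.1  # target error rate, i.e. 90% confidence

# Non-conformity score (assumed): absolute error of the surrogate on calibration data.
cal_scores = [abs(t - knn_predict(train_X, train_y, x)) for x, t in zip(cal_X, cal_y)]
half_width = conformal_quantile(cal_scores, epsilon)

# Interval for a test point: surrogate prediction +/- half_width.
covered = sum(
    abs(t - knn_predict(train_X, train_y, x)) <= half_width
    for x, t in zip(test_X, test_y)
)
coverage = covered / len(test_y)
print(f"half-width: {half_width:.3f}, empirical coverage: {coverage:.3f}")
```

Under exchangeability of calibration and test points, intervals built this way contain the true score with probability at least 1 - epsilon. The half-width is the efficiency measure: a surrogate that yields a smaller half-width at the same confidence level is more informative, which is the basis for comparing approximation methods.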