There has been a surge in Explainable-AI (XAI) methods that provide insight into the workings of Deep Neural Network (DNN) models. Integrated Gradients (IG) is a popular XAI algorithm that attributes relevance scores to input features commensurate with their contribution to the model's output. However, it requires multiple forward and backward passes through the model. Compared to a single forward-pass inference, generating the explanation therefore incurs a significant computational overhead, which hinders real-time XAI. This work addresses this issue by accelerating IG with a hardware-aware algorithm optimization. We propose a novel non-uniform interpolation scheme for computing the IG attribution scores, replacing the baseline uniform interpolation. Our algorithm significantly reduces the total number of interpolation steps required without adversely impacting convergence. Experiments on the ImageNet dataset using a pre-trained InceptionV3 model demonstrate a \textit{2.6--3.6}$\times$ performance speedup on GPU systems for iso-convergence. This speedup includes the minimal \textit{0.2--3.2}\% latency overhead introduced by the pre-processing stage that computes the non-uniform interpolation step sizes.
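To make the cost structure concrete, the following is a minimal sketch of the baseline uniform-interpolation IG computation only; it does not reproduce the non-uniform scheme proposed in this work. The toy `model`, its analytic gradient `model_grad`, and the helper `integrated_gradients_uniform` are hypothetical names introduced purely for illustration, and the midpoint Riemann sum is one common choice of approximation, assumed here. Each interpolation step costs one gradient evaluation, which is why reducing the step count translates directly into lower latency.

```python
import numpy as np

def model(x):
    # Toy differentiable "model" (an assumption): a scalar function of the input.
    return np.tanh(x).sum()

def model_grad(x):
    # Analytic gradient of the toy model; stands in for one backward pass.
    return 1.0 - np.tanh(x) ** 2

def integrated_gradients_uniform(x, baseline, steps):
    # Midpoint Riemann sum over uniformly spaced points on the straight-line
    # path from the baseline to the input.
    alphas = (np.arange(steps) + 0.5) / steps
    grad_sum = np.zeros_like(x)
    for a in alphas:
        # One gradient evaluation per interpolation step.
        grad_sum += model_grad(baseline + a * (x - baseline))
    return (x - baseline) * grad_sum / steps

x = np.array([0.5, -1.0, 2.0])
baseline = np.zeros_like(x)
attributions = integrated_gradients_uniform(x, baseline, steps=50)

# Completeness check: attributions should sum to model(x) - model(baseline).
completeness_gap = abs(attributions.sum() - (model(x) - model(baseline)))
```

The quantity `completeness_gap` is one simple way to monitor convergence: it shrinks as `steps` grows, so comparing schemes at an equal gap gives a rough notion of an iso-convergence comparison.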