Attribution-based methods are an important approach to interpreting the output of deep neural networks (DNNs): they assign a score to each input dimension that indicates its importance to the model outcome. Notably, attribution methods rely on the axioms of sensitivity and implementation invariance to ensure the validity and reliability of attribution results. Yet existing attribution methods remain challenging with respect to both effective interpretation and efficient computation. In this work, we introduce MFABA, an attribution algorithm that adheres to these axioms, as a novel method for interpreting DNNs. We also provide a theoretical proof and in-depth analysis of the MFABA algorithm and conduct a large-scale experiment. The results demonstrate its superiority: MFABA runs over 101.5142 times faster than state-of-the-art attribution algorithms. Its effectiveness is thoroughly evaluated through statistical analysis in comparison with other methods, and the full implementation package is open source at: https://github.com/LMBTough/MFABA
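To make concrete what it means to assign a score to each input dimension, the following minimal sketch computes attributions for a toy function. It is not MFABA: it uses Integrated Gradients, a well-known axiomatic attribution method, purely as an illustration. The names `toy_model`, `numerical_gradient`, and `integrated_gradients` are hypothetical, and gradients are approximated by finite differences instead of a deep learning framework.

```python
from typing import Callable, List

Vector = List[float]


def numerical_gradient(f: Callable[[Vector], float], x: Vector,
                       eps: float = 1e-6) -> Vector:
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def integrated_gradients(f: Callable[[Vector], float], x: Vector,
                         baseline: Vector, steps: int = 200) -> Vector:
    """Integrated Gradients (illustrative stand-in, NOT the MFABA algorithm).

    Averages the gradient along the straight path from baseline to x
    (midpoint rule) and scales it by (x - baseline) per dimension.
    """
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = numerical_gradient(f, point)
        for i, g in enumerate(grad):
            total[i] += g
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


def toy_model(v: Vector) -> float:
    """Hypothetical 'model': depends on v[0] and v[1], ignores v[2]."""
    return v[0] * v[0] + 3.0 * v[1]


x = [2.0, 1.0, 5.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(toy_model, x, baseline)
print(attributions)
```

Under these assumptions, the three scores are approximately 4, 3, and 0. They sum to `toy_model(x) - toy_model(baseline)`, and the input that the function ignores receives zero attribution, which is the kind of behavior the sensitivity axiom is meant to guarantee.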