Deep Learning (DL) models are often black boxes, making their decision-making processes difficult to interpret. This lack of transparency has driven advances in eXplainable Artificial Intelligence (XAI), a field dedicated to clarifying the reasoning behind DL model predictions. Among XAI techniques, attribution-based methods such as LRP and GradCAM are widely used, though they rely on approximations that can be imprecise. To address these limitations, we introduce One Matrix to Explain Neural Networks (OMENN), a novel post-hoc method that represents a neural network as a single, interpretable matrix for each specific input. This matrix is constructed through a series of linear transformations that represent the processing of the input by each successive layer of the network. As a result, OMENN provides locally precise, attribution-based explanations of the input across various modern models, including ViTs and CNNs. We present a theoretical analysis of OMENN grounded in the dynamic linearity property and validate its effectiveness with extensive tests on two XAI benchmarks, demonstrating that OMENN is competitive with state-of-the-art methods.
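To make the core idea concrete, below is a minimal sketch of how a piecewise-linear network collapses into one input-specific matrix, assuming a bias-free ReLU MLP. The layer shapes, the `input_specific_matrix` helper, and the attribution rule are illustrative assumptions, not the authors' implementation. For a fixed input, each ReLU acts as a 0/1 mask, so composing the weight matrices with these masks yields a single matrix M(x) with f(x) = M(x)x, which is the dynamic linearity property the abstract invokes.

```python
import torch
import torch.nn as nn

# Hypothetical toy network: a bias-free two-layer ReLU MLP.
torch.manual_seed(0)
layers = [nn.Linear(8, 16, bias=False), nn.Linear(16, 4, bias=False)]

def input_specific_matrix(x, layers):
    """Collapse a bias-free ReLU MLP into one matrix M(x) for this input.

    For a fixed x, each ReLU is a 0/1 diagonal gate, so the forward pass
    is exactly M(x) = D_{L-1} W_{L-1} ... D_1 W_1 (dynamic linearity).
    """
    M = torch.eye(x.numel())
    h = x
    for i, layer in enumerate(layers):
        pre = layer(h)                  # pre-activation at this layer
        M = layer.weight @ M            # fold the layer's weights into M
        if i < len(layers) - 1:         # ReLU on all but the last layer
            mask = (pre > 0).float()    # 0/1 gating chosen by this input
            M = torch.diag(mask) @ M    # fold the activation mask into M
            h = torch.relu(pre)
        else:
            h = pre
    return M, h

x = torch.randn(8)
M, out = input_specific_matrix(x, layers)

# The collapsed matrix reproduces the network's output exactly
# (local precision, no approximation), and row c of M attributes
# class logit c back to individual input features.
assert torch.allclose(M @ x, out, atol=1e-5)
attribution = M[0] * x  # per-feature contribution to logit 0
```

In this sketch the exactness check `M @ x == f(x)` is what distinguishes a dynamically linear decomposition from gradient-based approximations: the matrix is not a linearization around x but the network's actual linear action at x.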