Deep learning models are used in critical applications where mistakes can have serious consequences. It is therefore crucial to understand how and why these models generate predictions. This understanding helps verify that a model is learning the right patterns, detect biases in the data, improve model design, and build systems that can be trusted. This work proposes a new method for interpreting Convolutional Neural Networks in image classification tasks. The approach selects the feature maps that contribute most to each prediction. We encode this combinatorial selection problem as a quantum constrained optimization problem and propose solving it with quantum annealing. We evaluate our method against state-of-the-art explainable AI techniques, specifically GradCAM and GradCAM++, and observe improved class disentanglement, i.e., the model's decision boundaries become more distinct and its reasoning more transparent. This shows that our approach improves the quality of explanations, making it easier to see which features the model relies on for a specific prediction. In addition, we study the computational behavior of the quantum annealing algorithm. Specifically, we analyze the minimum energy gap of the system during computation and the probability that the algorithm finds the correct solution. These analyses provide theoretical insight into why the method works effectively in practice.
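To make the encoding step concrete, the following is a minimal sketch of how selecting feature maps might be written as a binary quadratic problem and how the minimum energy gap of an annealing schedule can be inspected on a toy instance. Everything specific here is an assumption for illustration and is not taken from the abstract: the per-map `importance` scores, the pairwise `redundancy` term, the cardinality target `k`, the `penalty` weight, and the transverse-field driver Hamiltonian. Exhaustive search and exact diagonalization stand in for a quantum annealer, so this only runs for a handful of feature maps.

```python
import itertools
import numpy as np

def build_qubo(importance, redundancy, k, penalty):
    """Upper-triangular QUBO for the (assumed) objective
    -sum_i a_i x_i + sum_{i<j} r_ij x_i x_j + penalty * (sum_i x_i - k)^2,
    with the constant penalty * k^2 dropped."""
    n = len(importance)
    Q = np.zeros((n, n))
    for i in range(n):
        # x_i^2 == x_i for binary x, so linear terms live on the diagonal
        Q[i, i] = -importance[i] + penalty * (1 - 2 * k)
        for j in range(i + 1, n):
            Q[i, j] = redundancy[i, j] + 2 * penalty
    return Q

def qubo_energy(Q, x):
    """Energy x^T Q x of a binary vector x."""
    return float(x @ Q @ x)

def brute_force_minimum(Q):
    """Classical stand-in for the annealer: try every bitstring."""
    best_x, best_e = None, np.inf
    for bits in itertools.product([0, 1], repeat=Q.shape[0]):
        x = np.array(bits)
        e = qubo_energy(Q, x)
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e

def annealing_min_gap(Q, steps=51):
    """Smallest gap between the two lowest eigenvalues of
    H(s) = (1 - s) * H_driver + s * H_problem for s in [0, 1],
    with H_driver = -sum_i X_i (an assumed, commonly used driver)."""
    n = Q.shape[0]
    states = np.array(list(itertools.product([0, 1], repeat=n)))
    # Problem Hamiltonian is diagonal in the computational basis
    h_problem = np.diag([qubo_energy(Q, x) for x in states])
    pauli_x = np.array([[0.0, 1.0], [1.0, 0.0]])
    identity = np.eye(2)
    h_driver = np.zeros((2 ** n, 2 ** n))
    for i in range(n):
        term = np.array([[1.0]])
        for j in range(n):
            term = np.kron(term, pauli_x if j == i else identity)
        h_driver -= term
    gaps = []
    for s in np.linspace(0.0, 1.0, steps):
        evals = np.linalg.eigvalsh((1 - s) * h_driver + s * h_problem)
        gaps.append(evals[1] - evals[0])
    return min(gaps)

# Toy instance: 4 hypothetical feature maps, keep k = 2 of them
importance = np.array([0.9, 0.1, 0.8, 0.3])
redundancy = np.zeros((4, 4))
redundancy[0, 2] = 0.2  # maps 0 and 2 are assumed to overlap slightly
Q = build_qubo(importance, redundancy, k=2, penalty=2.0)

selected, energy = brute_force_minimum(Q)
gap = annealing_min_gap(Q)
print("selected maps:", np.flatnonzero(selected), "energy:", energy)
print("minimum gap along the schedule:", gap)
```

The penalty weight must be large enough that violating the cardinality constraint costs more than any gain in the objective; the value used here is hand-picked for this toy instance. A smaller minimum gap generally means the anneal must run more slowly to stay in the ground state, which is why the gap and the success probability are natural quantities to study together.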