Using Logic Programming and Kernel-Grouping for Improving Interpretability of Convolutional Neural Networks

Within the realm of deep learning, the interpretability of Convolutional Neural Networks (CNNs), particularly in the context of image classification tasks, remains a formidable challenge. To this end, we present a neurosymbolic framework, NeSyFOLD-G, that generates a symbolic rule-set from the last-layer kernels of the CNN to make its underlying knowledge interpretable. What distinguishes NeSyFOLD-G from similar frameworks is that we first find groups of similar kernels in the CNN (kernel-grouping) using the cosine similarity between the feature maps generated by the various kernels. Once such kernel groups are found, we binarize each kernel group's output in the CNN and use it to generate a binarization table, which serves as input data to FOLD-SE-M, a Rule-Based Machine Learning (RBML) algorithm. FOLD-SE-M then generates a rule-set that can be used to make predictions. We present a novel kernel-grouping algorithm and show that grouping similar kernels leads to a significant reduction in the size of the rule-set generated by FOLD-SE-M, consequently improving interpretability. This rule-set symbolically encapsulates the connectionist knowledge of the trained CNN. It can be viewed as a normal logic program in which each predicate's truth value depends on a kernel group in the CNN. To make the rule-set human-understandable, each predicate is mapped to a concept using a few semantic segmentation masks of the images used for training. The last layers of the CNN can then be replaced by this rule-set to obtain the NeSy-G model, which can be used for the image classification task. The goal-directed Answer Set Programming (ASP) system s(CASP) can be used to obtain a justification for any prediction made by the NeSy-G model. We also propose a novel algorithm for labeling each predicate in the rule-set with the semantic concept(s) that its corresponding kernel group represents.
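
To make the kernel-grouping and binarization steps concrete, the following is a minimal sketch in Python with NumPy. It is not the algorithm proposed in this work: the greedy grouping rule, the `sim_threshold` parameter, the use of summed feature-map norms as a group's output, and the mean-plus-standard-deviation binarization threshold are all assumptions chosen for illustration, and every function name is hypothetical. The sketch only shows the general shape of the pipeline: cosine similarity between feature maps, groups of similar kernels, and one binary column per group.

```python
import numpy as np


def cosine_similarity_matrix(feature_maps):
    """Pairwise cosine similarity between kernels.

    feature_maps has shape (num_kernels, num_images, height, width);
    each kernel's maps over all images are flattened into one vector.
    """
    flat = feature_maps.reshape(feature_maps.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)  # guard against all-zero maps
    return unit @ unit.T


def group_kernels(similarity, sim_threshold=0.8):
    """Greedy, non-overlapping grouping (an assumed rule, for illustration).

    Each unassigned kernel seeds a group containing itself and every other
    unassigned kernel whose similarity to the seed is >= sim_threshold.
    """
    num_kernels = similarity.shape[0]
    unassigned = set(range(num_kernels))
    groups = []
    for seed in range(num_kernels):
        if seed not in unassigned:
            continue
        members = [seed] + [
            k for k in sorted(unassigned)
            if k != seed and similarity[seed, k] >= sim_threshold
        ]
        groups.append(members)
        unassigned -= set(members)
    return groups


def binarization_table(feature_maps, groups):
    """Build an (num_images, num_groups) table of 0/1 entries.

    A group's output for an image is taken to be the sum of the L2 norms of
    its members' feature maps; it is binarized against mean + std of that
    output over all images (both choices are assumptions).
    """
    num_kernels, num_images = feature_maps.shape[:2]
    norms = np.linalg.norm(
        feature_maps.reshape(num_kernels, num_images, -1), axis=2
    )
    table = np.zeros((num_images, len(groups)), dtype=int)
    for g, members in enumerate(groups):
        activation = norms[members].sum(axis=0)
        threshold = activation.mean() + activation.std()
        table[:, g] = (activation > threshold).astype(int)
    return table
```

In a pipeline of this shape, each column of the resulting table would act as one binary feature (one candidate predicate) for the rule learner, with the image's class label supplied alongside. Grouping reduces the number of columns, which is one intuition for why the learned rule-set can become smaller.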
