Rule-based surrogate models are an effective and interpretable way to approximate a Deep Neural Network's (DNN) decision boundaries, allowing humans to easily understand deep learning models. Current state-of-the-art decompositional methods, which consider the DNN's latent space to extract more exact rule sets, derive rule sets with high accuracy. However, they a) do not guarantee that the surrogate model has learned from the same variables as the DNN (alignment), b) only allow optimising for a single objective, such as accuracy, which can result in excessively large rule sets (complexity), and c) use decision tree algorithms as intermediate models, which can result in different explanations for the same DNN (stability). To address these limitations, this paper introduces CGX (Column Generation eXplainer), a decompositional method that uses dual linear programming to extract rules from the hidden representations of the DNN. This approach allows optimising for any number of objectives and lets users tailor the explanation model to their needs. We evaluate CGX on a wide variety of tasks and show that it meets all three criteria: the explanation model is exactly reproducible, which guarantees stability, and the rule set size is reduced by more than 80% (complexity) at equivalent or improved accuracy and fidelity across tasks (alignment).