Deep neural networks (DNNs) are increasingly being used as controllers in reactive systems. However, DNNs are highly opaque, which renders it difficult to explain and justify their actions. To mitigate this issue, there has been a surge of interest in explainable AI (XAI) techniques, capable of pinpointing the input features that caused the DNN to act as it did. Existing XAI techniques typically face two limitations: (i) they are heuristic, and do not provide formal guarantees that the explanations are correct; and (ii) they often apply to ``one-shot'' systems, where the DNN is invoked independently of past invocations, as opposed to reactive systems. Here, we begin bridging this gap, and propose a formal DNN-verification-based XAI technique for reasoning about multi-step, reactive systems. We suggest methods for efficiently calculating succinct explanations, by exploiting the system's transition constraints in order to curtail the search space explored by the underlying verifier. We evaluate our approach on two popular benchmarks from the domain of automated navigation; and observe that our methods allow the efficient computation of minimal and minimum explanations, significantly outperforming the state of the art. We also demonstrate that our methods produce formal explanations that are more reliable than competing, non-verification-based XAI techniques.
翻译:深度神经网络(DNN)正越来越多地被用作反应式系统中的控制器。然而,DNN具有高度不透明性,这使得解释和证明其行为变得困难。为了缓解这一问题,可解释人工智能(XAI)技术引起了广泛关注,这类技术能够精确定位导致DNN做出特定决策的输入特征。现有的XAI技术通常面临两个局限:(i)它们基于启发式方法,无法提供解释正确的形式化保证;(ii)它们通常适用于“一次性”系统(即DNN调用独立于历史调用)而非反应式系统。本文开始弥合这一差距,提出一种基于DNN形式化验证的XAI技术,用于对多步骤反应式系统进行推理。我们提出了高效计算简洁解释的方法,通过利用系统的转移约束来缩小底层验证器搜索的空间。我们在自动导航领域的两个经典基准测试上评估了该方法,观察发现我们的方法能够高效计算最小和极小解释,显著优于现有技术。同时,我们证明该方法生成的正式解释比竞争性的非验证型XAI技术更可靠。