Deep neural networks (DNNs) are increasingly being used as controllers in reactive systems. However, DNNs are highly opaque, which renders it difficult to explain and justify their actions. To mitigate this issue, there has been a surge of interest in explainable AI (XAI) techniques, capable of pinpointing the input features that caused the DNN to act as it did. Existing XAI techniques typically face two limitations: (i) they are heuristic, and do not provide formal guarantees that the explanations are correct; and (ii) they often apply to ``one-shot'' systems, where the DNN is invoked independently of past invocations, as opposed to reactive systems. Here, we begin bridging this gap, and propose a formal DNN-verification-based XAI technique for reasoning about multi-step, reactive systems. We suggest methods for efficiently calculating succinct explanations, by exploiting the system's transition constraints in order to curtail the search space explored by the underlying verifier. We evaluate our approach on two popular benchmarks from the domain of automated navigation; and observe that our methods allow the efficient computation of minimal and minimum explanations, significantly outperforming the state of the art. We also demonstrate that our methods produce formal explanations that are more reliable than competing, non-verification-based XAI techniques.
翻译:深度神经网络(DNN)正越来越多地被用作反应式系统中的控制器。然而,DNN的高度不透明性使其行为难以解释和论证。为缓解这一问题,可解释人工智能(XAI)技术日益受到关注,该类技术能够精准定位导致DNN做出特定决策的输入特征。现有XAI技术通常面临两个局限:(i)它们基于启发式方法,无法对解释的正确性提供形式化保证;(ii)它们通常适用于“单次”系统(即DNN独立于先前调用运行),而非反应式系统。本文开始弥合这一差距,提出一种基于DNN形式化验证的XAI技术,用于推理多步反应式系统。我们通过利用系统转移约束来缩减底层验证器探索的搜索空间,提出了高效计算简洁解释的方法。我们在自动导航领域的两个经典基准测试上评估了该方法,结果表明,我们的方法能够高效计算最小解释和极小解释,显著优于现有技术。我们还证明,与基于非验证的竞争性XAI技术相比,我们的方法产生的形式化解释具有更高的可靠性。