The lack of transparency in the decision-making processes of deep learning systems presents a significant challenge in modern artificial intelligence (AI), as it impairs users' ability to rely on and verify these systems. To address this challenge, Concept Bottleneck Models (CBMs) have made significant progress by incorporating human-interpretable concepts into deep learning architectures. This approach allows predictions to be traced back to specific concept patterns that users can understand and potentially intervene on. However, the task predictors of existing CBMs are not fully interpretable, preventing a thorough analysis and any form of formal verification of their decision-making process prior to deployment, thereby raising significant reliability concerns. To bridge this gap, we introduce Concept-based Memory Reasoner (CMR), a novel CBM designed to provide a human-understandable and provably verifiable task prediction process. Our approach models each task prediction as a neural selection mechanism over a memory of learnable logic rules, followed by a symbolic evaluation of the selected rule. The presence of an explicit memory and the symbolic evaluation allow domain experts to inspect the task prediction process and formally verify the validity of global properties of interest. Experimental results demonstrate that CMR achieves accuracy-interpretability trade-offs comparable to state-of-the-art CBMs, discovers logic rules consistent with ground truths, enables rule interventions, and supports pre-deployment verification.
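To make the select-then-evaluate mechanism concrete, the sketch below illustrates, in plain PyTorch and with hypothetical class and parameter names (`RuleMemoryTaskPredictor`, `n_rules`, `emb_dim`), one possible form of such a task head: concept activations drive a neural selector over a memory of learnable rules, and each rule is evaluated as a (fuzzy) conjunction over the concepts. This is a minimal illustration under our own design assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class RuleMemoryTaskPredictor(nn.Module):
    """Illustrative sketch of a rule-memory task head.

    Each rule assigns every concept one of three roles:
    0 = irrelevant, 1 = positive literal, 2 = negative literal.
    """

    def __init__(self, n_concepts: int, n_rules: int, emb_dim: int = 16):
        super().__init__()
        # Explicit memory of learnable rule embeddings, decoded into concept roles.
        self.rule_emb = nn.Parameter(torch.randn(n_rules, emb_dim))
        self.role_decoder = nn.Linear(emb_dim, n_concepts * 3)
        # Neural selector: scores each rule given the concept activations.
        self.selector = nn.Linear(n_concepts, n_rules)
        self.n_concepts = n_concepts

    def decode_rules(self) -> torch.Tensor:
        # (n_rules, n_concepts, 3): soft assignment of each concept to a role.
        logits = self.role_decoder(self.rule_emb)
        return logits.view(-1, self.n_concepts, 3).softmax(dim=-1)

    def forward(self, c: torch.Tensor) -> torch.Tensor:
        # c: (batch, n_concepts) concept predictions in [0, 1].
        roles = self.decode_rules()                        # (R, C, 3)
        # Symbolic (fuzzy AND) evaluation of every rule on every sample:
        # a literal is satisfied if the concept is irrelevant to the rule,
        # or if its value matches the required polarity.
        sat = (roles[:, :, 0]
               + roles[:, :, 1] * c.unsqueeze(1)
               + roles[:, :, 2] * (1.0 - c.unsqueeze(1)))  # (B, R, C)
        rule_val = sat.prod(dim=-1)                        # (B, R)
        # Neural selection over the rule memory, then expected rule value.
        attn = self.selector(c).softmax(dim=-1)            # (B, R)
        return (attn * rule_val).sum(dim=-1)               # (B,) task probability
```

Because the decoded rules live in an explicit memory rather than in opaque weights, a domain expert can enumerate them after training and check global properties (e.g., that no rule predicts the task from a forbidden concept pattern) before deployment.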