Multi-dimensional concept discovery (MCD): A unifying framework with completeness guarantees

From arXiv. 24 pages, 11 figures. This work builds on an earlier manuscript (arXiv:2203.06043) and crucially extends it. Code is available at https://github.com/jvielhaben/MCD-XAI

The completeness axiom renders the explanation of a post-hoc XAI method only locally faithful to the model, i.e. for a single decision. For the trustworthy application of XAI, in particular for high-stakes decisions, a more global model understanding is required. Recently, concept-based methods have been proposed, which, however, are not guaranteed to be bound to the actual model reasoning. To circumvent this problem, we propose Multi-dimensional Concept Discovery (MCD) as an extension of previous approaches that fulfills a completeness relation on the level of concepts. Our method starts from general linear subspaces as concepts and requires neither reinforcing concept interpretability nor re-training model parts. We propose sparse subspace clustering to discover improved concepts and fully leverage the potential of multi-dimensional subspaces. MCD offers two complementary analysis tools for concepts in input space: (1) concept activation maps, which show where a concept is expressed within a sample, allowing for concept characterization through prototypical samples, and (2) concept relevance heatmaps, which decompose the model decision into concept contributions. Both tools together enable a detailed understanding of the model reasoning, which is guaranteed to relate to the model via a completeness relation. This paves the way towards more trustworthy concept-based XAI. We empirically demonstrate the superiority of MCD over more constrained concept definitions.
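To make the idea of a completeness relation on the level of concepts more concrete, the following is a minimal sketch and not the authors' implementation (see the linked repository for that). It assumes that the model ends in a linear head acting on a feature vector, that each concept is a linear subspace given by a basis matrix, and that whatever lies outside all concept subspaces is collected in a residual term. The names `concept_decomposition`, `bases`, `phi`, `w` and `b` are hypothetical and chosen for illustration only.

```python
import numpy as np

def concept_decomposition(phi, bases):
    """Split a feature vector into one component per concept subspace plus a residual.

    phi:   (d,) feature vector (hypothetical penultimate-layer features).
    bases: list of (d, k_c) arrays, each spanning one concept subspace.
           Assumption: the stacked bases are linearly independent.
    """
    joint = np.hstack(bases)  # (d, sum of k_c)
    # Least-squares coordinates of phi in the joint concept basis.
    coeffs, *_ = np.linalg.lstsq(joint, phi, rcond=None)
    parts, start = [], 0
    for basis in bases:
        k = basis.shape[1]
        parts.append(basis @ coeffs[start:start + k])
        start += k
    # Whatever the concept subspaces do not capture goes into the residual.
    residual = phi - sum(parts)
    return parts, residual

rng = np.random.default_rng(0)
d = 8
bases = [rng.standard_normal((d, 2)), rng.standard_normal((d, 3))]  # two toy concepts
phi = rng.standard_normal(d)
w, b = rng.standard_normal(d), 0.1  # hypothetical linear head: logit = w @ phi + b

parts, residual = concept_decomposition(phi, bases)
relevances = [float(w @ p) for p in parts]   # per-concept contribution to the logit
residual_relevance = float(w @ residual)     # contribution outside all concepts
logit = float(w @ phi + b)
total = sum(relevances) + residual_relevance + b

# Completeness: the concept contributions, residual and bias add up to the logit.
print(np.isclose(total, logit))
```

Because the components and the residual sum to `phi` by construction, the per-concept contributions always add up to the model output under these assumptions; this is the sense in which a concept-level explanation can be tied to the model's decision rather than only correlated with it.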

