Prototypical Self-Explainable Models Without Re-training

Explainable AI (XAI) has unfolded in two distinct research directions: on the one hand, post-hoc methods that explain the predictions of a pre-trained black-box model and, on the other hand, self-explainable models (SEMs), which are trained directly to provide explanations alongside their predictions. While the latter are preferred in most safety-critical scenarios, post-hoc approaches have received the majority of attention until now, owing to their simplicity and their ability to explain base models without retraining. Current SEMs, by contrast, require complex architectures and heavily regularized loss functions, and therefore need specific and costly training. To address this shortcoming and facilitate wider use of SEMs, we propose a simple yet efficient universal method called KMEx (K-Means Explainer), which can convert any existing pre-trained model into a prototypical SEM. The motivation behind KMEx is to push towards more transparent deep learning-based decision-making via class-prototype-based explanations that are guaranteed to be diverse and trustworthy without retraining the base model. We compare models obtained from KMEx to state-of-the-art SEMs using an extensive qualitative evaluation to highlight the strengths and weaknesses of each model, further paving the way toward a more reliable and objective evaluation of SEMs.
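The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible reading suggested by the name "K-Means Explainer": embeddings from a frozen pre-trained encoder are clustered per class with k-means, the centroids serve as class prototypes, and a prediction is the class of the nearest prototype. The function names (`kmeans`, `fit_class_prototypes`, `predict`), the number of prototypes per class, the Euclidean distance, and the synthetic two-dimensional "embeddings" are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids

def fit_class_prototypes(embeddings, labels, k_per_class=2):
    """Cluster each class's embeddings separately (assumed reading of KMEx)."""
    classes = np.unique(labels)
    protos = np.concatenate(
        [kmeans(embeddings[labels == c], k_per_class) for c in classes]
    )
    proto_labels = np.repeat(classes, k_per_class)
    return protos, proto_labels

def predict(embeddings, protos, proto_labels):
    """Return the class of the nearest prototype and that prototype's index."""
    dists = np.linalg.norm(embeddings[:, None, :] - protos[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    return proto_labels[nearest], nearest

# Synthetic stand-in for the output of a frozen encoder (hypothetical data).
rng = np.random.default_rng(0)
emb = np.concatenate([rng.normal(0.0, 0.1, (40, 2)), rng.normal(5.0, 0.1, (40, 2))])
lab = np.repeat([0, 1], 40)

protos, proto_labels = fit_class_prototypes(emb, lab)
pred, nearest = predict(emb, protos, proto_labels)
```

In this sketch the base encoder is never updated; only the prototypes are fitted. The index in `nearest` identifies which prototype drove each prediction, which is the kind of handle a prototype-based explanation would build on.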

