Recent advances in explainable machine learning provide effective and faithful solutions for interpreting model behavior. However, many explanation methods are computationally inefficient, which largely limits their deployment in practical scenarios. Real-time explainer (RTX) frameworks have thus been proposed to accelerate model explanation by learning an explainer that produces an explanation in a single feed-forward pass. Existing RTX frameworks typically train the explainer under the supervised learning paradigm, which requires large amounts of explanation labels as ground truth. Because accurate explanation labels are usually hard to obtain under constrained computational resources and limited human effort, effective explainer training remains challenging in practice. In this work, we propose a COntrastive Real-Time eXplanation (CoRTX) framework that learns explanation-oriented representations and reduces the heavy dependence of explainer training on explanation labels. Specifically, we design a synthetic strategy to select positive and negative instances for explanation learning. Theoretical analysis shows that our selection strategy benefits the contrastive learning process on explanation tasks. Experimental results on three real-world datasets further demonstrate the efficiency and efficacy of the proposed CoRTX framework.
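To make the role of positive and negative instances concrete, the following is a minimal sketch of a generic contrastive objective. The abstract does not specify the loss or the selection strategy, so this assumes a standard InfoNCE-style loss over cosine similarities. The function names (`cosine`, `info_nce`), the temperature value, and the toy vectors are hypothetical and are not taken from CoRTX.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss: -log softmax score of the positive pair.

    Assumption: a generic contrastive loss used for illustration,
    not necessarily the objective used by CoRTX.
    """
    pos_logit = cosine(anchor, positive) / temperature
    logits = [pos_logit] + [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - pos_logit


# Toy representations (hypothetical values).
anchor = [1.0, 0.0]
close_positive = [0.9, 0.1]   # positive that resembles the anchor
far_positive = [0.0, 1.0]     # poorly chosen positive
negatives = [[-1.0, 0.0], [0.0, -1.0]]

loss_close = info_nce(anchor, close_positive, negatives)
loss_far = info_nce(anchor, far_positive, negatives)
```

In this sketch, a positive instance whose representation lies close to the anchor yields a lower loss than a poorly chosen one, which illustrates why the choice of positive and negative instances matters for the learned representation.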