The number of proposed local model-agnostic explanation techniques has grown rapidly in recent years. One main reason is that the bar for developing new explainability techniques is low, owing to the lack of optimal evaluation measures. Without rigorous measures, it is hard to obtain concrete evidence that a new explanation technique significantly outperforms its predecessors. Our study proposes a new taxonomy of evaluation methods for local explanations: robustness, evaluation using ground truth from synthetic datasets and interpretable models, model randomization, and human-grounded evaluation. Using this taxonomy, we highlight that all categories of evaluation methods, except those based on ground truth from interpretable models, suffer from a problem we call the "blame problem." We argue that evaluation based on ground truth from interpretable models is a more reasonable way to evaluate local model-agnostic explanations. However, we show that even this category of evaluation measures has further limitations, and the evaluation of local explanations remains an open research problem.