Approaches for evaluating feature importance estimates, also referred to as attributions, have been established in a wide range of contexts. Developing robust techniques for benchmarking their performance is a critical concern in explainable deep learning. This study examines the reliability of the RemOve-And-Retrain (ROAR) procedure, which is widely used to measure the performance of feature importance estimates. Our theoretical analysis and empirical investigations show that attributions containing less information about the decision function may score better in ROAR benchmarks, contradicting the original intent of ROAR. The same phenomenon is observed in the recently introduced variant RemOve-And-Debias (ROAD), and we posit a persistent blurriness bias in ROAR attribution metrics. Our findings caution against indiscriminate use of ROAR metrics. The code is available as open source.