Approaches for evaluating approximations of feature importance, also known as attribution methods, have been established across a wide range of contexts. Developing robust techniques for benchmarking their performance is a critical concern in explainable deep learning. This study examines the reliability of the RemOve-And-Retrain (ROAR) procedure, which is widely used to measure the performance of feature importance estimates. Our theoretical analysis and empirical investigations show that attributions containing less information about the decision function may score better in ROAR benchmarks, contradicting the original intent of ROAR. The same phenomenon is observed in the recently introduced variant RemOve-And-Debias (ROAD), and we posit a persistent pattern of blurriness bias in ROAR attribution metrics. Our findings caution against the indiscriminate use of ROAR metrics.