Methods for evaluating feature importance approximations, also known as attribution methods, have been established across a wide range of settings. Developing robust techniques for benchmarking their performance is a central concern in explainable deep learning. This study examines the reliability of the RemOve-And-Retrain (ROAR) procedure, which is widely used to assess the performance of feature importance estimates. Our theoretical analysis and empirical investigations reveal that attributions carrying less information about the decision function can perform better on ROAR benchmarks, contradicting ROAR's original intent. The same phenomenon appears in the recently introduced variant RemOve-And-Debias (ROAD), and we conjecture a persistent blurriness bias in ROAR-style attribution metrics. Our findings caution against indiscriminate use of ROAR metrics.