In explainable AI (XAI), robustness refers to how much the explanation of a learned model's prediction varies when the input that produced the prediction changes. Intuitively, if the input being explained is modified slightly, subtly enough that the model's prediction barely changes, then we would expect the explanation for the new input not to change much either. We argue that combining the explanations of an ensemble's weak learners through discriminative averaging can improve the robustness of explanations in ensemble methods. This approach has been implemented and tested with the post-hoc SHAP method and a Random Forest ensemble, with successful results. The improvements obtained have been measured quantitatively, and some insights into the robustness of explainability in ensemble methods are presented.
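To make the idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not define the robustness measure or the weighting scheme behind "discriminative averaging", so both are assumptions here: robustness is approximated by a local ratio of explanation change to input change, and each weak learner's explanation is weighted by the inverse of that ratio. All function names (`instability`, `stability_weights`, `weighted_average_explanation`) are hypothetical, and the two toy explainers stand in for per-tree SHAP attributions.

```python
import math
import random


def l2(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def instability(explain, x, eps=0.05, n_samples=20, seed=0):
    """Assumed robustness proxy: mean of ||e(x) - e(x')|| / ||x - x'||
    over small random perturbations x' of x. Lower means more robust."""
    rng = random.Random(seed)
    base = explain(x)
    ratios = []
    for _ in range(n_samples):
        x_pert = [xi + rng.uniform(-eps, eps) for xi in x]
        dist = l2(x, x_pert)
        if dist > 0:
            ratios.append(l2(base, explain(x_pert)) / dist)
    return sum(ratios) / len(ratios) if ratios else 0.0


def stability_weights(explainers, x, eps=0.05, n_samples=20, seed=0):
    """Hypothetical 'discriminative' weights: learners whose explanations
    move less under perturbation receive a larger weight."""
    return [
        1.0 / (1.0 + instability(e, x, eps=eps, n_samples=n_samples, seed=seed))
        for e in explainers
    ]


def weighted_average_explanation(explainers, x, weights):
    """Weighted mean of the per-learner attribution vectors at x."""
    total = sum(weights)
    combined = [0.0] * len(x)
    for explain, w in zip(explainers, weights):
        attributions = explain(x)
        for j, value in enumerate(attributions):
            combined[j] += w * value / total
    return combined


# Toy stand-ins for two weak learners' attribution functions (assumption:
# in practice these would be SHAP values computed per tree of the forest).
def stable_learner(x):
    return [0.5 * v for v in x]


def unstable_learner(x):
    return [5.0 * v for v in x]


if __name__ == "__main__":
    x0 = [1.0, -2.0, 0.5]
    learners = [stable_learner, unstable_learner]
    weights = stability_weights(learners, x0)

    def uniform(x):
        return weighted_average_explanation(learners, x, [1.0, 1.0])

    def weighted(x):
        return weighted_average_explanation(learners, x, weights)

    print("uniform averaging instability: ", instability(uniform, x0))
    print("weighted averaging instability:", instability(weighted, x0))
```

In this toy setting the weighted combination changes less under perturbation than the plain average, simply because the less stable learner is down-weighted. Whether the same holds for real SHAP attributions on a Random Forest depends on the data and on the weighting actually used in the paper.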