Growing legislative concern over the use of Artificial Intelligence (AI) has recently led to a series of regulations striving for more transparent, trustworthy, and accountable AI. Alongside these proposals, the field of Explainable AI (XAI) has grown rapidly, but the use of its techniques has at times led to unexpected results. Robustness is, in fact, a key property of these approaches that is often overlooked: the stability of an explanation under random and adversarial perturbations must be evaluated to ensure that its results are trustworthy. To this end, we propose a test to evaluate robustness to non-adversarial perturbations and an ensemble approach to analyse in greater depth the robustness of XAI methods applied to neural networks and tabular datasets. We show how leveraging the manifold hypothesis and ensemble approaches can support an in-depth analysis of robustness.