Most methods in explainable AI (XAI) focus on providing reasons for the prediction made from a given set of features. We instead solve an inverse explanation problem: given the deviation of a label, find the reasons for this deviation. We use a Bayesian framework to recover the ``true'' features conditioned on the observed label value. We efficiently explain the deviation of a label value from the mode by identifying and ranking the influential features using the ``distances'' in the ANOVA functional decomposition. We show that the new method is more human-intuitive and robust than methods based on mean values, such as SHapley Additive exPlanations (SHAP values). The extra cost of solving the Bayesian inverse problem is dimension-independent.
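The core idea of conditioning features on an observed label deviation can be illustrated on a toy conjugate-Gaussian model. The sketch below is an illustrative assumption, not the paper's method: it takes a linear model with a standard-normal prior on the features, computes the Gaussian posterior mean of the features given an observed label, and ranks features by how far the posterior pulls each one from the prior mode.

```python
def inverse_explanation(w, y_obs, sigma2=0.1):
    """Toy inverse-explanation sketch (assumed setup, not the paper's model).

    Prior: x ~ N(0, I).  Likelihood: y_obs = w . x + noise, noise ~ N(0, sigma2).
    Returns the posterior mean of x given y_obs and a ranking of features by
    the distance of that posterior mean from the prior mode (0).
    """
    # Conjugate Gaussian update: posterior mean = w * y / (w' w + sigma2)
    s = sum(wi * wi for wi in w) + sigma2
    post_mean = [wi * y_obs / s for wi in w]
    # Rank features by how strongly they must deviate to explain y_obs
    ranking = sorted(range(len(w)), key=lambda i: -abs(post_mean[i]))
    return post_mean, ranking

post_mean, ranking = inverse_explanation([3.0, -1.0, 0.5], y_obs=2.0)
# The feature with the largest |weight| is the top-ranked cause of the deviation.
```

In this toy case the ranking simply follows the magnitude of the weights; the point is the direction of inference: rather than attributing a prediction to fixed features, the observed label deviation is pushed backward through Bayes' rule to the features most likely responsible for it.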