Counterfactuals are widely used to explain ML model predictions by providing alternative scenarios that would yield more desirable predictions. They can be generated by a variety of methods that optimize different, sometimes conflicting, quality measures and therefore produce quite different solutions. However, choosing the most appropriate explanation method, and then one of its generated counterfactuals, is not an easy task. Instead of forcing the user to test many different explanation methods and analyse their conflicting solutions, in this paper we propose a multi-stage ensemble approach that selects a single counterfactual based on multiple-criteria analysis. It offers a compromise solution that scores well on several popular quality measures. The approach exploits the dominance relation and the ideal-point decision aid method, which selects one counterfactual from the Pareto front. The conducted experiments demonstrate that the proposed approach generates fully actionable counterfactuals with attractive compromise values of the considered quality measures.
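The selection stage described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes every quality measure is expressed as a cost to be minimized, takes the ideal point as the per-criterion minima over the Pareto front, and uses a plain Euclidean distance (the actual method may normalize criteria first). The measure names and candidate scores in the example are hypothetical.

```python
def dominates(a, b):
    """a dominates b if a is no worse on every criterion and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep only the non-dominated candidates (the Pareto front)."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]

def ideal_point_choice(front):
    """Pick the front member closest (Euclidean) to the ideal point,
    formed here from the per-criterion minima over the front."""
    ideal = [min(c[i] for c in front) for i in range(len(front[0]))]
    return min(front, key=lambda c: sum((x - z) ** 2 for x, z in zip(c, ideal)) ** 0.5)

# Hypothetical counterfactuals scored on (proximity, sparsity, implausibility); lower is better.
scores = [(0.2, 3, 0.9), (0.5, 1, 0.4), (0.3, 2, 0.5), (0.6, 4, 0.8)]
front = pareto_front(scores)       # drops the dominated candidate (0.6, 4, 0.8)
best = ideal_point_choice(front)   # → (0.5, 1, 0.4), closest to the ideal point (0.2, 1, 0.4)
```

Filtering by dominance first guarantees the returned counterfactual cannot be improved on one measure without worsening another; the ideal-point step then resolves the remaining trade-off into a single compromise solution.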