Generating collective counterfactual explanations in score-based classification via mathematical optimization

From arXiv. This research has been funded in part by research projects EC H2020 MSCA RISE NeEDS (Grant agreement ID: 822214), FQM-329, P18-FR-2369 and US-1381178 (Junta de Andalucía, Spain), and PID2019-110886RB-I00 and PID2022-137818OB-I00 (Ministerio de Ciencia, Innovación y Universidades, Spain). This support is gratefully acknowledged.

As Machine Learning models are increasingly used in high-stakes decision-making settings, tools for understanding how these models arrive at their decisions have become essential. Given a trained Supervised Classification model, explanations can be obtained via counterfactual analysis: a counterfactual explanation of an instance indicates how the instance should be minimally modified so that the perturbed instance is assigned to the desired class by the classification model. Most of the Counterfactual Analysis literature focuses on the single-instance single-counterfactual setting, in which one explanation is provided for one instance. Taking a stakeholder's perspective, in this paper we introduce the so-called collective counterfactual explanations. By means of novel Mathematical Optimization models, we provide a counterfactual explanation for each instance in a group of interest, so that the total cost of the perturbations is minimized under some linking constraints. Constructing counterfactuals collectively rather than individually enables us to detect the features that are critical, across the entire dataset, for having the individuals classified in the desired class. Our methodology also allows some instances to be treated individually, so that the collective counterfactual analysis is performed for only a fraction of the records in the group of interest. In this way, outliers are identified and handled appropriately. Under some assumptions on the classifier and on the space in which counterfactuals are sought, finding collective counterfactuals reduces to solving a convex quadratic, linearly constrained mixed-integer optimization problem, which, for datasets of moderate size, can be solved to optimality using existing solvers. The performance of our approach is illustrated on real-world datasets, demonstrating its usefulness.
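To make the idea of a linking constraint concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes a linear score w·x + b with a hypothetical threshold, a squared Euclidean perturbation cost, and one particular linking constraint chosen for illustration: all instances in the group may only change the same set of at most `max_features` features. Under these assumptions the problem is a small convex quadratic mixed-integer program, which the sketch solves by brute-force enumeration of feature subsets rather than with a solver. All function and parameter names are illustrative.

```python
from itertools import combinations


def min_cost_on_support(x, w, b, support, threshold=0.0):
    """Cheapest squared-L2 change to x, touching only the features in
    `support`, such that the linear score w.x + b reaches `threshold`."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    gap = threshold - score
    if gap <= 0:
        return 0.0, list(x)  # already classified in the desired class
    norm_sq = sum(w[j] ** 2 for j in support)
    if norm_sq == 0:
        return float("inf"), None  # these features cannot move the score
    step = gap / norm_sq
    x_cf = list(x)
    for j in support:
        x_cf[j] += step * w[j]  # move along w restricted to the support
    return gap ** 2 / norm_sq, x_cf


def collective_counterfactuals(X, w, b, max_features, threshold=0.0):
    """Enumerate shared feature subsets (assumed linking constraint) and
    return (total_cost, shared_support, counterfactuals) for the best one.
    Subsets of exactly `max_features` suffice: adding an allowed feature
    never increases the cost."""
    best = (float("inf"), None, None)
    for support in combinations(range(len(w)), max_features):
        total, cfs = 0.0, []
        for x in X:
            cost, x_cf = min_cost_on_support(x, w, b, support, threshold)
            total += cost
            cfs.append(x_cf)
        if total < best[0]:
            best = (total, support, cfs)
    return best


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    w, b = [1.0, 2.0, 0.5], -4.0
    X = [[1.0, 1.0, 0.0], [0.0, 1.0, 2.0]]
    total, support, cfs = collective_counterfactuals(X, w, b, max_features=1)
    print(total, support, cfs)
```

The enumeration grows combinatorially with the number of features, so it only serves to show the structure of the problem: binary choices select the shared features, and the remaining problem in the continuous perturbations is convex quadratic with linear constraints. For realistic sizes, a mixed-integer solver would replace the loop over subsets.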
