Counterfactual Explanations (CEs) are an important tool in Algorithmic Recourse for addressing two questions: (1) What are the crucial factors that led to an automated prediction or decision? (2) How can these factors be changed to achieve a more favorable outcome from the user's perspective? Guiding users' interaction with AI systems through easy-to-understand explanations and feasible, easy-to-attain changes is therefore essential for the trustworthy adoption and long-term acceptance of these systems. The literature offers various methods for generating CEs and various quality measures for evaluating them. However, generating CEs is usually computationally expensive, and the resulting suggestions are often unrealistic and thus non-actionable. In this paper, we introduce a new method for generating CEs for a pre-trained binary classifier: we first shape the latent space of an autoencoder into a mixture of Gaussian distributions, and then generate CEs in that latent space by linear interpolation between the query sample and the centroid of the target class. We show that our method preserves the characteristics of the input sample during the counterfactual search. In experiments on image and tabular datasets, the proposed method is competitive across different quality measures and efficiently returns results that lie closer to the original data manifold than those of three state-of-the-art methods, a property that is essential for realistic high-dimensional machine learning applications.
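The latent-space interpolation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `latent_counterfactual`, the step count `n_steps`, the stopping rule (return the first interpolated point that the classifier assigns to the target class), and the toy `decode` and `classify` stand-ins are all assumptions made for illustration. The autoencoder with a Gaussian-mixture latent space is not reproduced here; the sketch assumes the latent code of the query and the target-class centroid are already available.

```python
import numpy as np


def latent_counterfactual(z_query, z_centroid, decode, classify, target, n_steps=20):
    """Interpolate linearly from z_query toward z_centroid in latent space.

    Returns the first decoded point that `classify` assigns to `target`,
    together with the interpolation weight alpha, or (None, None) if no
    point on the path reaches the target class.
    """
    for alpha in np.linspace(0.0, 1.0, n_steps + 1):
        # Convex combination of the query code and the target-class centroid.
        z = (1.0 - alpha) * z_query + alpha * z_centroid
        x = decode(z)
        if classify(x) == target:
            return x, float(alpha)
    return None, None


# Hypothetical stand-ins: an identity decoder and a threshold classifier.
def decode(z):
    return z


def classify(x):
    return int(x[0] > 0.0)


z_query = np.array([-1.0, 0.5])    # assumed latent code of the query sample
z_centroid = np.array([1.0, 0.0])  # assumed centroid of the target class
x_cf, alpha = latent_counterfactual(z_query, z_centroid, decode, classify, target=1)
print(x_cf, alpha)
```

Stopping at the smallest weight that flips the prediction keeps the counterfactual close to the query sample, which is one plausible reading of how the search preserves the input's characteristics; the paper itself may use a different criterion.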