Counterfactual Explanations (CEs) help address the question: How can the factors that influence the prediction of a predictive model be changed to achieve a more favorable outcome from a user's perspective? Because they are easy to understand, they have the potential to guide the user's interaction with AI systems. To be applicable, CEs need to be realistic and actionable. Various methods have been proposed in the literature to generate CEs. However, the majority of research on CEs focuses on classification problems, where questions like ``What should I do to get my rejected loan approved?'' are raised. Yet in practice, questions like ``What should I do to increase my salary?'' are more naturally posed as regression problems. In this paper, we introduce a novel method to generate CEs for a pre-trained regressor by first disentangling the label-relevant from the label-irrelevant dimensions in the latent space. CEs are then generated by combining the label-irrelevant dimensions with the predefined output. The intuition behind this approach is that the ideal counterfactual search should focus on the label-irrelevant characteristics of the input and suggest changes toward target-relevant characteristics. Searching in the latent space could help achieve this goal. We show that our method maintains the characteristics of the query sample during the counterfactual search. In various experiments, we demonstrate that the proposed method is competitive on several quality measures for image and tabular datasets in regression problem settings. It efficiently returns results that lie closer to the original data manifold than those of three state-of-the-art methods, which is essential for realistic high-dimensional machine learning applications. Our code will be made available as an open-source package upon the publication of this work.
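To make the core idea concrete, the following is a minimal sketch under strong simplifying assumptions, not the architecture proposed in this paper. It assumes a hypothetical linear generative model in which an input is decoded from label-irrelevant latent dimensions `z` and a scalar label `y`; the names `M`, `encode`, `decode`, `regressor`, and `counterfactual` are illustrative only. A counterfactual is obtained by keeping `z` from the query and decoding it together with the desired target value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setting (an assumption for illustration):
# x = M @ [z, y], where z holds the label-irrelevant latent
# dimensions and y is the regression target.
latent_dim, input_dim = 2, 5
M = rng.normal(size=(input_dim, latent_dim + 1))  # stand-in decoder weights
M_pinv = np.linalg.pinv(M)                        # stand-in encoder weights


def decode(z, y):
    """Map label-irrelevant code z and label y back to input space."""
    return M @ np.append(z, y)


def encode(x):
    """Split an input into its label-irrelevant code and its label code."""
    code = M_pinv @ x
    return code[:-1], code[-1]


def regressor(x):
    """Stand-in for the pre-trained regressor (linear in this toy case)."""
    return float(M_pinv[-1] @ x)


def counterfactual(x, target):
    """Keep the label-irrelevant code of x and decode it with the target."""
    z, _ = encode(x)
    return decode(z, target)


query = decode(np.array([0.5, -1.0]), 2.0)  # query sample with label 2.0
cf = counterfactual(query, target=3.5)      # counterfactual with label 3.5
```

In this toy case the counterfactual shares the label-irrelevant code of the query while the stand-in regressor returns the requested target. With learned, nonlinear encoders and decoders, both properties would hold only approximately and would need to be evaluated empirically.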