When individuals are subject to adverse outcomes from machine learning models, it is desirable to provide a recourse path that helps them achieve a positive outcome. Recent work has shown that counterfactual explanations, which can be used as a means of single-step recourse, are vulnerable to privacy issues, putting an individual's privacy at risk. Providing a sequential multi-step path for recourse can amplify this risk. Furthermore, simply adding noise to recourse paths found by existing methods can degrade the realism and actionability of the path for an end user. In this work, we address privacy issues when generating realistic recourse paths based on instance-based counterfactual explanations, and present PrivRecourse: an end-to-end privacy-preserving pipeline that provides realistic recourse paths. PrivRecourse uses differentially private (DP) clustering to represent non-overlapping subsets of the private dataset. The DP cluster centers then serve as the nodes of a graph from which recourse paths are generated, so that the paths are realistic, that is, feasible and actionable. We empirically evaluate our approach on finance datasets and compare it with two alternative ways of building the graph: simply adding noise to data instances, and using DP synthetic data. We observe that PrivRecourse can provide paths that are private and realistic.
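The pipeline described above (private cluster centers, a graph over those centers, a path through the graph) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the one-shot Laplace-noised centroid update stands in for whatever DP clustering algorithm PrivRecourse actually uses, and the function names, the `radius` edge rule, the fixed seed grid, and the `favorable` classifier are all hypothetical. It assumes features are scaled to [0, 1] and that the seed centers are chosen without looking at the data; it has not been vetted as a production-grade DP mechanism.

```python
import heapq
import math
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_centers(points, seeds, epsilon, rng):
    """One noisy centroid update (assumed stand-in for DP clustering)."""
    dim = len(seeds[0])
    sums = [[0.0] * dim for _ in seeds]
    counts = [0] * len(seeds)
    for p in points:
        # Assign each point to its nearest data-independent seed.
        j = min(range(len(seeds)), key=lambda i: math.dist(p, seeds[i]))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    centers = []
    for s, c in zip(sums, counts):
        # Half the budget goes to the count (sensitivity 1), half to the
        # sum (L1 sensitivity dim, since every feature lies in [0, 1]).
        n = max(1.0, c + laplace(2.0 / epsilon, rng))
        noisy = [(v + laplace(2.0 * dim / epsilon, rng)) / n for v in s]
        centers.append(tuple(min(1.0, max(0.0, v)) for v in noisy))
    return centers


def build_graph(centers, radius):
    """Connect centers closer than radius; edge weight is the distance."""
    graph = {i: [] for i in range(len(centers))}
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            w = math.dist(centers[i], centers[j])
            if w <= radius:
                graph[i].append((j, w))
                graph[j].append((i, w))
    return graph


def recourse_path(graph, start, is_favorable):
    """Dijkstra from start to the nearest node with a favorable outcome."""
    dist, prev, heap = {start: 0.0}, {}, [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        if is_favorable(u):
            path = [u]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return path[::-1]
        for v, w in graph[u]:
            if d + w < dist.get(v, math.inf):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return None  # no favorable node is reachable


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(500)]  # toy data
seeds = [(x / 4 + 0.125, y / 4 + 0.125) for x in range(4) for y in range(4)]
centers = noisy_centers(points, seeds, epsilon=1.0, rng=rng)
graph = build_graph(centers, radius=0.5)
favorable = lambda i: centers[i][0] + centers[i][1] > 1.2  # hypothetical model
start = min(range(len(centers)), key=lambda i: math.dist((0.1, 0.1), centers[i]))
path = recourse_path(graph, start, favorable)
```

Because every node on the returned path is a noisy cluster center rather than a raw record, each recourse step points at an aggregate of the private data and not at any single individual. In this sketch, the quality of the path depends on `epsilon`, the number of seeds, and `radius`.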