Counterfactual (CF) explanations for machine learning (ML) models are preferred by end-users, as they explain the predictions of ML models by providing a recourse (or contrastive) case to individuals who are adversely impacted by predicted outcomes. Existing CF explanation methods generate recourses under the assumption that the underlying target ML model remains stationary over time. However, because distributional shifts in training data are common, ML models are constantly updated in practice, which might render previously generated recourses invalid and diminish end-users' trust in our algorithmic framework. To address this problem, we propose RoCourseNet, a training framework that jointly optimizes predictions and recourses that are robust to future data shifts. This work makes four key contributions: (1) We formulate the robust recourse generation problem as a tri-level optimization problem that consists of two sub-problems: (i) a bi-level problem that finds the worst-case adversarial shift in the training data, and (ii) an outer minimization problem that generates robust recourses against this worst-case shift. (2) We leverage adversarial training to solve this tri-level optimization problem by (i) proposing a novel virtual data shift (VDS) algorithm that finds worst-case shifted ML models by explicitly considering the worst-case data shift in the training dataset, and (ii) using a block-wise coordinate descent procedure to optimize for predictions and their corresponding robust recourses. (3) We evaluate RoCourseNet's performance on three real-world datasets and show that RoCourseNet consistently achieves more than 96% robust validity and outperforms state-of-the-art baselines by at least 10% in generating robust CF explanations. (4) Finally, we generalize the RoCourseNet framework to accommodate any parametric post-hoc method for improving robust validity.
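To make the robust validity metric concrete, the following minimal sketch assumes it means the fraction of recourses that still flip the original prediction once the model has been updated on shifted data. The linear classifier, the helper names `predict` and `flip_rate`, and all numbers are hypothetical illustrations, not RoCourseNet's implementation or its reported results.

```python
import numpy as np

def predict(model, x):
    """Binary prediction of a linear classifier (hypothetical stand-in model)."""
    weights, bias = model
    return (x @ weights + bias > 0).astype(int)

def flip_rate(reference_model, scoring_model, x, x_cf):
    """Fraction of recourses x_cf whose label under scoring_model differs
    from the label reference_model assigned to the original inputs x."""
    y_before = predict(reference_model, x)
    y_after = predict(scoring_model, x_cf)
    return float(np.mean(y_after != y_before))

# Assumed toy setup: the decision boundary moves from x0 > 0 to x0 > 0.5
# after the model is retrained on shifted data.
original = (np.array([1.0, 0.0]), 0.0)
shifted = (np.array([1.0, 0.0]), -0.5)

# Four individuals with an adverse prediction (class 0) and their recourses.
x = np.array([[-1.0, 0.0], [-2.0, 0.0], [-1.0, 1.0], [-3.0, 0.0]])
x_cf = np.array([[0.2, 0.0], [0.8, 0.0], [0.6, 1.0], [0.1, 0.0]])

plain_validity = flip_rate(original, original, x, x_cf)   # scored on the original model
robust_validity = flip_rate(original, shifted, x, x_cf)   # scored on the shifted model
print(plain_validity, robust_validity)
```

In this toy case every recourse is valid against the original model, but the two that sit closest to the old decision boundary stop working after the shift. This is the failure mode that training against a worst-case shift is meant to reduce.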