We introduce MCCE: Monte Carlo sampling of valid and realistic Counterfactual Explanations for tabular data, a novel counterfactual explanation method that generates on-manifold, actionable and valid counterfactuals by modeling the joint distribution of the mutable features given the immutable features and the decision. Unlike other on-manifold methods that tend to rely on variational autoencoders and have strict prediction model and data requirements, MCCE handles any type of prediction model and categorical features with more than two levels. MCCE first models the joint distribution of the features and the decision with an autoregressive generative model where the conditionals are estimated using decision trees. Then, it samples a large set of observations from this model, and finally, it removes the samples that do not obey certain criteria. We compare MCCE with a range of state-of-the-art on-manifold counterfactual methods using four well-known data sets and show that MCCE outperforms these methods on all common performance metrics and speed. In particular, including the decision in the modeling process improves the efficiency of the method substantially.
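The three steps described above (model, sample, filter) can be illustrated with a minimal sketch on toy categorical data. This is not the authors' implementation. To stay dependency-free, the conditionals are estimated with empirical lookup tables rather than decision trees, which is a simplification that only works for small categorical data. The feature names, the `predict` black box, the Hamming-distance cost, and all function names are assumptions made for illustration.

```python
import itertools
import random
from collections import defaultdict

IMMUTABLE = ["age"]               # hypothetical immutable feature
MUTABLE = ["education", "hours"]  # hypothetical mutable features, in sampling order

def predict(row):
    """Stand-in black-box model (an assumption, not from the paper)."""
    return 1 if row["education"] == "degree" or row["hours"] == "full" else 0

def fit_conditionals(rows):
    """Step 1: autoregressive factorization. Each mutable feature is conditioned
    on the decision, the immutable features, and the earlier mutable features.
    Lookup tables replace the decision trees named in the abstract."""
    tables = []
    for j, feat in enumerate(MUTABLE):
        parents = ["decision"] + IMMUTABLE + MUTABLE[:j]
        table = defaultdict(list)
        for row in rows:
            table[tuple(row[p] for p in parents)].append(row[feat])
        tables.append((feat, parents, dict(table)))
    return tables

def sample_candidates(tables, fixed, n, rng):
    """Step 2: draw n candidates with the immutable features and the desired
    decision held fixed; mutable features are sampled one after another."""
    candidates = []
    for _ in range(n):
        cand = dict(fixed)
        for feat, parents, table in tables:
            values = table.get(tuple(cand[p] for p in parents))
            if not values:        # parent combination never seen in training
                break
            cand[feat] = rng.choice(values)
        else:
            candidates.append(cand)
    return candidates

def select_counterfactual(candidates, factual, target):
    """Step 3: drop candidates the model does not classify as the target,
    then return the one that changes the fewest mutable features."""
    valid = [c for c in candidates if predict(c) == target]
    if not valid:
        return None
    return min(valid, key=lambda c: sum(c[f] != factual[f] for f in MUTABLE))

# Toy training set: every feature combination, labeled with the model's decision.
rows = []
for age, edu, hrs in itertools.product(["young", "old"],
                                       ["none", "degree"],
                                       ["part", "full"]):
    row = {"age": age, "education": edu, "hours": hrs}
    row["decision"] = predict(row)
    rows.append(row)

factual = {"age": "young", "education": "none", "hours": "part"}  # predicted 0
tables = fit_conditionals(rows)
fixed = {"decision": 1, "age": factual["age"]}
candidates = sample_candidates(tables, fixed, n=200, rng=random.Random(0))
cf = select_counterfactual(candidates, factual, target=1)
print(cf)
```

Because the decision is one of the conditioning variables, most sampled candidates already receive the desired prediction, so the filtering step discards few samples. This is the efficiency gain the abstract attributes to including the decision in the modeling process. In this toy setting the lookup tables reproduce the training data exactly, so the validity check never rejects anything; with an approximate model such as decision trees it remains necessary.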