We study the problem of finding optimal sparse, manifold-aligned counterfactual explanations for classifiers. Canonically, this can be formulated as an optimization problem with multiple non-convex components, including classifier loss functions and manifold-alignment (or \emph{plausibility}) metrics. Enforcing \emph{sparsity}, i.e., shorter explanations, complicates the problem further. Existing methods often focus on specific models and plausibility measures, and rely on convex $\ell_1$ regularizers to enforce sparsity. In this paper, we tackle the canonical formulation using the accelerated proximal gradient (APG) method, a simple yet efficient first-order procedure that handles smooth non-convex objectives and non-smooth $\ell_p$ (with $0 \leq p < 1$) regularizers. This enables our approach to seamlessly incorporate a variety of classifiers and plausibility measures while producing sparser solutions. Our algorithm requires only that the data-manifold regularizer be differentiable, and it supports box constraints for bounded feature ranges, ensuring the generated counterfactuals remain \emph{actionable}. Finally, experiments on real-world datasets demonstrate that our approach produces sparse, manifold-aligned counterfactual explanations while remaining close to the factual data and computationally efficient.
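To make the setup concrete, the canonical formulation referred to above can be sketched as follows; the notation ($x$, $x'$, $f$, $y^{\ast}$, $R_{\mathrm{data}}$, the weights $\lambda_{\mathrm{pl}}, \lambda_{\mathrm{sp}}$, step size $\eta$, and box $[l,u]$) is illustrative rather than taken from the paper. Given a factual input $x$ and a classifier $f$, a counterfactual $x'$ is sought by trading off a classifier loss toward a target class $y^{\ast}$, a differentiable plausibility (data-manifold) regularizer, and a sparsity-inducing $\ell_p$ penalty on the change, subject to box constraints on the features:
\[
\min_{x' \in [l,u]} \;\; \mathcal{L}\big(f(x'), y^{\ast}\big) \;+\; \lambda_{\mathrm{pl}}\, R_{\mathrm{data}}(x') \;+\; \lambda_{\mathrm{sp}}\, \lVert x' - x \rVert_p^p, \qquad 0 \leq p < 1,
\]
where the first two terms are smooth but possibly non-convex and the last term is non-smooth and non-convex. A schematic APG iteration on this objective (again, an illustration under these notational assumptions, not necessarily the paper's exact scheme) extrapolates the current iterate $w^{(k)}$ with a momentum weight $\beta_k$, takes a gradient step on the smooth part $g(w) = \mathcal{L}\big(f(w), y^{\ast}\big) + \lambda_{\mathrm{pl}} R_{\mathrm{data}}(w)$, applies the proximal operator of the $\ell_p$ term, and maps the result back into the box:
\[
z^{(k)} = w^{(k)} + \beta_k \big(w^{(k)} - w^{(k-1)}\big), \qquad
w^{(k+1)} = \Pi_{[l,u]}\Big( \mathrm{prox}_{\eta \lambda_{\mathrm{sp}} \lVert \cdot\, - x \rVert_p^p}\big( z^{(k)} - \eta \nabla g(z^{(k)}) \big) \Big),
\]
with $w^{(k)}$ serving as the running counterfactual estimate. Both the $\ell_p$ proximal step and the box projection $\Pi_{[l,u]}$ act coordinate-wise, which is what keeps the per-iteration cost low in this kind of scheme; how the constraint is folded into the proximal step in the actual method is detailed in the paper.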