Counterfactual explanations, and their associated algorithmic recourse, are typically used to understand, explain, and potentially alter a prediction from a black-box classifier. In this paper, we propose extending the use of counterfactuals to evaluate progress in sequential decision-making tasks. To this end, we introduce TraCE (Trajectory Counterfactual Explanation) scores, a model-agnostic, modular framework that distills progress in highly complex scenarios into a single value. We demonstrate TraCE's utility across domains by showcasing its main properties in two case studies spanning healthcare and climate change.