In recent years, a range of machine learning and deep learning architectures have been applied successfully to predictive process analytics. The inherent opacity of these models, however, makes it difficult for human decision-makers to understand the reasoning behind their predictions. This concern has motivated counterfactual explanations: human-understandable what-if scenarios that give clearer insight into the decision-making process behind undesirable predictions. Generating counterfactual explanations is particularly challenging for the sequential (business) process cases typically used in predictive process analytics. Our paper tackles this challenge by introducing a data-driven approach, REVISEDplus, that generates more feasible and plausible counterfactual explanations. First, we restrict the counterfactual algorithm to generate counterfactuals that lie within a high-density region of the process data, ensuring that the proposed counterfactuals are realistic and feasible with respect to the observed process data distribution. Second, we ensure plausibility by learning sequential patterns between the activities in the process cases, expressed as Declare language templates. Finally, we evaluate the properties that define the validity of counterfactuals.
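To make the plausibility step more concrete, the following is a minimal sketch of how sequential patterns expressed as Declare templates could be learned from an event log and then used to screen a candidate counterfactual trace. It is an illustration under stated assumptions, not the REVISEDplus implementation: the abstract does not say which templates are used, so the choice of Response and Precedence, the `support` threshold, and all function names (`mine_constraints`, `is_plausible`) are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Trace = Sequence[str]
Constraint = Tuple[str, str, str]  # (template name, activity a, activity b)


def response(trace: Trace, a: str, b: str) -> bool:
    """Declare Response(a, b): every occurrence of a is eventually followed by b."""
    pending = False
    for activity in trace:
        if activity == a:
            pending = True
        elif activity == b:
            pending = False
    return not pending


def precedence(trace: Trace, a: str, b: str) -> bool:
    """Declare Precedence(a, b): b may occur only after a has occurred."""
    seen_a = False
    for activity in trace:
        if activity == a:
            seen_a = True
        elif activity == b and not seen_a:
            return False
    return True


# Assumption: only these two templates are considered in this sketch.
TEMPLATES: Dict[str, Callable[[Trace, str, str], bool]] = {
    "response": response,
    "precedence": precedence,
}


def mine_constraints(log: List[Trace], support: float = 1.0) -> List[Constraint]:
    """Keep each (template, a, b) satisfied by at least `support` of the traces.

    Traces in which the activating activity never occurs satisfy the
    constraint vacuously and are counted as satisfying it here.
    """
    activities = sorted({activity for trace in log for activity in trace})
    mined: List[Constraint] = []
    for name, check in TEMPLATES.items():
        for a in activities:
            for b in activities:
                if a == b:
                    continue
                satisfied = sum(check(trace, a, b) for trace in log)
                if satisfied / len(log) >= support:
                    mined.append((name, a, b))
    return mined


def is_plausible(trace: Trace, constraints: List[Constraint]) -> bool:
    """A candidate counterfactual is kept only if it violates no mined constraint."""
    return all(TEMPLATES[name](trace, a, b) for name, a, b in constraints)
```

For example, if every trace in a log starts with `register` before `check`, the constraint Precedence(register, check) is mined, and a candidate counterfactual that places `check` before `register` is rejected. In practice, a lower `support` threshold and a filter on vacuous satisfaction would probably be needed for noisy logs.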