Predictive process analytics focuses on predicting future states, such as the outcome of running process instances. These techniques often rely on machine learning or deep learning models (such as LSTMs) to make such predictions. However, deep models are complex and difficult for users to understand. Counterfactuals answer ``what-if'' questions and thereby help users understand the reasoning behind a prediction. For example, what if customers were called instead of emailed? Would this alternative lead to a different outcome? Current methods for generating counterfactual sequences either ignore the process behavior, which yields invalid or infeasible counterfactual process instances, or rely heavily on domain knowledge. In this work, we propose a general framework that uses evolutionary methods to generate counterfactual sequences. Our framework does not require domain knowledge. Instead, we train a Markov model to compute the feasibility of generated counterfactual sequences and adapt three other measures (delta in outcome prediction, similarity, and sparsity) to ensure their overall viability. The evaluation shows that our framework generates viable counterfactual sequences, outperforms baseline methods in viability, and yields results similar to those of the state-of-the-art method that requires domain knowledge.
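To make the feasibility idea concrete, the following minimal sketch scores a candidate sequence with a first-order Markov model estimated from observed traces. It is an illustration under stated assumptions, not the framework's actual implementation: the first-order form, the unsmoothed maximum-likelihood estimates, the boundary markers, the function names `fit_markov` and `feasibility`, and the toy event log are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical boundary markers so that first and last events are also scored.
START, END = "<start>", "<end>"


def fit_markov(traces):
    """Estimate first-order transition probabilities from observed traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        path = [START, *trace, END]
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1
    # Normalize counts per predecessor (maximum likelihood, no smoothing).
    return {
        prev: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
        for prev, nxts in counts.items()
    }


def feasibility(model, trace):
    """Probability of a trace under the model; 0.0 if any transition is unseen."""
    prob = 1.0
    path = [START, *trace, END]
    for prev, nxt in zip(path, path[1:]):
        prob *= model.get(prev, {}).get(nxt, 0.0)
    return prob


# Toy event log (assumed for illustration only).
log = [
    ["register", "email", "close"],
    ["register", "call", "close"],
    ["register", "email", "close"],
    ["register", "call", "email", "close"],
]
model = fit_markov(log)

observed_like = feasibility(model, ["register", "call", "close"])
out_of_order = feasibility(model, ["close", "register"])
```

In this toy log, replacing the email with a call gives a sequence with feasibility 0.25, whereas a sequence that starts with `close` scores 0.0 because that transition never occurs in the log. A search procedure could use such a score to penalize candidates that the observed process behavior does not support.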