Shifting from understanding and predicting processes to optimizing them offers great benefits to businesses and other organizations. Precisely timed process interventions are the cornerstone of effective optimization. Prescriptive process monitoring (PresPM) is the sub-field of process mining that concentrates on process optimization. The emerging PresPM literature identifies two state-of-the-art methods, causal inference (CI) and reinforcement learning (RL), but does not compare them quantitatively. Most experiments are carried out on historical data, which compromises the accuracy of the methods' evaluation and precludes online RL. Our contribution consists of experiments on timed process interventions with synthetic data, which make genuine online RL and its comparison to CI possible and allow for an accurate evaluation of the results. Our experiments reveal that RL's policies outperform those from CI and are also more robust. Indeed, the RL policies approach perfect policies. Unlike CI, the online RL approach can be applied without modification to other, more generic PresPM problems such as next best activity recommendations. Nonetheless, CI has its merits in settings where online learning is not an option.