Recent work has introduced an approach for explaining why a Predictive Process Monitoring (PPM) model for outcome-oriented predictions makes wrong predictions. It has also shown how explanations obtained with state-of-the-art post-hoc explainers can be exploited, in a semi-automated way, to identify the features that most commonly induce a predictor to make mistakes and, in turn, to reduce the impact of those features and increase the accuracy of the predictive model. This work starts from the assumption that frequent control-flow patterns in event logs may represent important features that characterize, and therefore explain, a given prediction. In this paper, we therefore (i) employ a novel encoding that leverages DECLARE constraints in PPM and compare its effectiveness with that of state-of-the-art PPM encodings, in particular for the task of outcome-oriented prediction; (ii) introduce a fully automated pipeline for identifying the features that most commonly induce a predictor to make mistakes; and (iii) show the effectiveness of the proposed pipeline in increasing the accuracy of the predictive model by validating it on different real-life datasets.
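To make the idea of a constraint-based encoding concrete, the following is a minimal sketch of how traces could be turned into binary feature vectors, one feature per DECLARE constraint. It is an illustration under stated assumptions, not the encoding used in the paper: the helper names (`existence`, `response`, `precedence`, `encode`), the activity labels, and the choice to treat vacuously satisfied constraints as satisfied are all hypothetical.

```python
from typing import Callable, Dict, List

Trace = List[str]                      # a trace is an ordered list of activity names
Checker = Callable[[Trace], bool]      # a checker tells whether a constraint holds


def existence(a: str) -> Checker:
    """Existence(a): activity a occurs at least once in the trace."""
    return lambda trace: a in trace


def response(a: str, b: str) -> Checker:
    """Response(a, b): every occurrence of a is eventually followed by b (a != b)."""
    def check(trace: Trace) -> bool:
        pending = False                # True while some a still waits for a later b
        for act in trace:
            if act == a:
                pending = True
            elif act == b:
                pending = False
        return not pending
    return check


def precedence(a: str, b: str) -> Checker:
    """Precedence(a, b): b may occur only if a has occurred before it (a != b)."""
    def check(trace: Trace) -> bool:
        seen_a = False
        for act in trace:
            if act == a:
                seen_a = True
            elif act == b and not seen_a:
                return False
        return True
    return check


def encode(traces: List[Trace], constraints: Dict[str, Checker]) -> List[List[int]]:
    """One row per trace, one 0/1 column per constraint (1 = satisfied).

    Assumption: vacuous satisfaction counts as satisfied.
    """
    return [[int(holds(t)) for holds in constraints.values()] for t in traces]


# Hypothetical constraints and traces, for illustration only.
CONSTRAINTS: Dict[str, Checker] = {
    "existence(pay)": existence("pay"),
    "response(order, pay)": response("order", "pay"),
    "precedence(order, ship)": precedence("order", "ship"),
}
traces = [
    ["order", "pay", "ship"],
    ["order", "ship"],
    ["ship", "order", "pay"],
]
features = encode(traces, CONSTRAINTS)
```

Each row of `features` could then be passed to any standard classifier together with the trace outcome label. Because every column corresponds to a named constraint, a post-hoc explainer would attribute importance to control-flow patterns and not to individual events.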