Prescriptive process monitoring methods seek to optimize the performance of business processes by triggering interventions at runtime, thereby increasing the probability of positive case outcomes. These interventions are triggered according to an intervention policy. Reinforcement learning has been put forward as an approach to learning intervention policies through trial and error. Existing approaches in this space assume that the number of resources available to perform interventions in a process is unlimited, an assumption that is unrealistic in practice. This paper argues that, in the presence of resource constraints, a key dilemma in prescriptive process monitoring is to trigger interventions based not only on predictions of their necessity, timeliness, or effect, but also on the uncertainty of these predictions and the level of resource utilization. Intuitively, committing scarce resources to an intervention whose necessity or effect is highly uncertain may lead to suboptimal outcomes. Accordingly, the paper proposes a reinforcement learning approach for prescriptive process monitoring that leverages conformal prediction techniques to take into account the uncertainty of the predictions on which an intervention decision is based. An evaluation using real-life datasets demonstrates that explicitly modeling uncertainty using conformal prediction helps reinforcement learning agents converge towards policies with higher net intervention gain.
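To make the role of conformal prediction concrete, the following is a minimal sketch of split conformal prediction for a binary case-outcome classifier, together with a hand-written gating rule that also checks resource availability. It is not the approach proposed in the paper, which learns the policy with reinforcement learning; the fixed rule only stands in for a learned policy. All names (`conformal_threshold`, `prediction_set`, `should_intervene`), the calibration numbers, and the convention that label 1 means "the case is expected to end negatively without an intervention" are assumptions made for illustration.

```python
import math

# Toy calibration data (made up for illustration): predicted probability of
# label 1 ("negative outcome without intervention") and the observed label.
CAL_PROBS = [0.9, 0.8, 0.3, 0.1, 0.7, 0.6, 0.95, 0.05, 0.35, 0.4]
CAL_LABELS = [1, 1, 1, 0, 1, 0, 1, 0, 1, 0]


def conformal_threshold(cal_probs, cal_labels, alpha=0.2):
    """Split-conformal threshold on the score 1 - p(true label)."""
    # Nonconformity score: low when the model gave the true label high probability.
    scores = sorted((1.0 - p) if y == 1 else p
                    for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if k > n:
        return 1.0  # too few calibration points: every label is kept
    return scores[k - 1]


def prediction_set(p_label1, qhat):
    """Labels whose nonconformity score does not exceed the threshold."""
    labels = set()
    if 1.0 - p_label1 <= qhat:
        labels.add(1)
    if p_label1 <= qhat:
        labels.add(0)
    return labels


def should_intervene(p_label1, qhat, busy_resources, total_resources):
    """Hypothetical rule: intervene only when the conformal set is exactly {1}
    (the prediction is unambiguous) and at least one resource is free."""
    confident = prediction_set(p_label1, qhat) == {1}
    return confident and busy_resources < total_resources


qhat = conformal_threshold(CAL_PROBS, CAL_LABELS, alpha=0.2)
decision = should_intervene(0.95, qhat, busy_resources=1, total_resources=2)
```

In a reinforcement learning setting, the size of the prediction set and the ratio of busy to total resources would more plausibly be passed to the agent as state features, leaving the agent to learn when an intervention is worth a scarce resource.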