Increasing the success rate of a process, i.e., the percentage of cases that end in a positive outcome, is a recurrent process improvement goal. At runtime, there are often certain actions (a.k.a. treatments) that workers may execute to raise the probability that a case ends in a positive outcome. For example, in a loan origination process, a possible treatment is to issue multiple loan offers to increase the probability that the customer takes a loan. Each treatment has a cost. Thus, when defining policies for prescribing treatments to cases, managers need to consider the net gain of the treatments. Also, the effect of a treatment varies over time: treating a case earlier may be more effective than treating it later. This paper presents a prescriptive monitoring method that automates this decision-making task. The method combines causal inference and reinforcement learning to learn treatment policies that maximize the net gain. It uses a conformal prediction technique to speed up the convergence of the reinforcement learning mechanism by separating cases that are likely to end in a positive or negative outcome from uncertain cases. An evaluation on two real-life datasets shows that the proposed method outperforms a state-of-the-art baseline.
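To make the filtering idea concrete, the following is a minimal sketch of how a conformal prediction step could separate confidently predicted cases from uncertain ones. It is not the authors' implementation: it assumes a generic split-conformal scheme over a binary outcome classifier, and all function names (`conformal_threshold`, `prediction_set`, `is_uncertain`, `net_gain`) are hypothetical. The net-gain formula (estimated uplift times the benefit of a positive outcome, minus the treatment cost) is likewise an assumption, since the abstract does not define it.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal score threshold from a held-out calibration set.

    cal_probs[i] is the predicted probability of a positive outcome for
    calibration case i; cal_labels[i] is its observed outcome (1 or 0).
    """
    # Nonconformity score: 1 minus the probability given to the true label.
    scores = sorted(
        1.0 - (p if y == 1 else 1.0 - p)
        for p, y in zip(cal_probs, cal_labels)
    )
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)), 1-based.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return 1.0  # too few calibration cases: admit every label
    return scores[rank - 1]


def prediction_set(prob_positive, threshold):
    """Outcome labels whose nonconformity score is within the threshold."""
    labels = set()
    if 1.0 - prob_positive <= threshold:  # score of label 1
        labels.add(1)
    if prob_positive <= threshold:        # score of label 0
        labels.add(0)
    return labels


def is_uncertain(prob_positive, threshold):
    """A case is uncertain unless exactly one outcome label is admitted."""
    return len(prediction_set(prob_positive, threshold)) != 1


def net_gain(uplift, benefit, cost):
    """Assumed net gain: expected benefit from the uplift minus the cost."""
    return uplift * benefit - cost
```

Under these assumptions, only the cases flagged by `is_uncertain` would be passed on to the reinforcement learning agent, and `net_gain` would serve as a reward signal for deciding whether a treatment is worth its cost.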