Deep learning has driven many recent advances in process analytics, especially in predictive and prescriptive monitoring. However, standard objectives such as cross-entropy optimize local next-step likelihoods and capture control-flow structure only implicitly. As a result, a model can achieve high token-level accuracy while still admitting globally imprecise behaviour. We introduce DIFF-ERO, a conformance-aware loss function for deep learning models on process data. DIFF-ERO is a differentiable formulation of entropy-based stochastic conformance that incorporates control-flow information during training. Our approach constructs batch-level stochastic transition matrices with soft edge memberships, so that structural precision and recall signals directly inform backpropagation. The loss is model-agnostic and can be applied whenever the final representation parametrizes stochastic transitions. We instantiate DIFF-ERO in transformer encoder-decoder pipelines for next-activity prediction and combine it with cross-entropy to analyse how its theoretical components affect convergence. In benchmarks against other loss functions and targets, DIFF-ERO improves predictive performance where structure matters most and maintains parity elsewhere. At the same time, the learned stochastic automaton converges towards the structural ground truth, indicating that the network internalizes process model structure.
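To make the idea of a batch-level stochastic transition matrix with soft edge memberships more concrete, the following is a minimal NumPy sketch, not the DIFF-ERO formulation itself. It assumes that the model emits a next-activity distribution for every prefix position, and it replaces the entropy-based precision and recall with a simple overlap ratio purely for illustration. The function names, the `threshold` and `temperature` parameters, and the sigmoid membership are all hypothetical choices that do not come from the abstract.

```python
import numpy as np


def soft_transition_matrix(current, probs, num_activities):
    """Aggregate predicted next-activity distributions into a row-stochastic matrix.

    current: (B, T) integer ids of the activity observed at each position.
    probs:   (B, T, A) predicted distribution over the next activity.
    """
    onehot = np.eye(num_activities)[current.reshape(-1)]       # (B*T, A)
    counts = onehot.T @ probs.reshape(-1, num_activities)      # (A, A) soft counts
    row_sums = counts.sum(axis=1, keepdims=True)
    # Activities that never occur in the batch keep an all-zero row.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def soft_edge_membership(matrix, threshold=0.05, temperature=0.01):
    """Smooth stand-in for the hard test `matrix > threshold` (assumed form)."""
    return 1.0 / (1.0 + np.exp(-(matrix - threshold) / temperature))


def overlap_precision_recall(model_edges, log_edges):
    """Illustrative overlap ratios; NOT the entropy-based conformance measures."""
    overlap = (model_edges * log_edges).sum()
    return overlap / model_edges.sum(), overlap / log_edges.sum()


# Toy batch: two traces, two positions each, three activities.
current = np.array([[0, 1], [0, 1]])
targets = np.array([[1, 2], [1, 2]])
probs = np.array([
    [[0.1, 0.8, 0.1], [0.1, 0.1, 0.8]],
    [[0.2, 0.6, 0.2], [0.0, 0.3, 0.7]],
])

model_matrix = soft_transition_matrix(current, probs, 3)
log_matrix = soft_transition_matrix(current, np.eye(3)[targets], 3)
precision, recall = overlap_precision_recall(
    soft_edge_membership(model_matrix), soft_edge_membership(log_matrix)
)
```

Every step here is a composition of smooth operations (matrix products, normalization, a sigmoid), so the same computation written in an automatic-differentiation framework would let gradients flow from the structural scores back to the predicted distributions. In this toy batch the model spreads probability mass over edges the log never uses, so the illustrative precision is lower than the recall.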