We compare lightweight automata-based models (n-grams) with neural architectures (LSTM, Transformer) for next-activity prediction in streaming event logs. Experiments on synthetic patterns and five real-world process mining datasets show that n-grams with appropriate context windows achieve accuracy comparable to that of neural models while requiring substantially fewer resources. Unlike windowed neural architectures, whose performance is unstable, n-grams provide stable and consistent accuracy. We demonstrate that classical ensemble methods such as voting improve n-gram performance, but they require running many agents in parallel during inference, which increases memory consumption and latency. We therefore propose an ensemble method, the promotion algorithm, that dynamically selects between two active models during inference, reducing overhead compared to classical voting schemes. On real-world datasets, these ensembles match or exceed the accuracy of non-windowed neural models at lower computational cost.
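To make the n-gram setting concrete, the following is a minimal sketch of a frequency-based next-activity predictor with a fixed context window, together with a plain majority vote over several such predictors. It is an illustration under stated assumptions, not the authors' implementation: the names `NGramPredictor`, `vote`, and `case_id` are hypothetical, events are assumed to arrive one at a time as (case, activity) pairs, and the promotion algorithm is deliberately not sketched because the abstract does not specify its rules.

```python
from collections import Counter, defaultdict, deque


class NGramPredictor:
    """Hypothetical frequency-based next-activity predictor (illustrative only)."""

    def __init__(self, window):
        # Number of past activities used as context (an assumed hyperparameter).
        self.window = window
        # Maps a context tuple to a Counter of the activities observed after it.
        self.counts = defaultdict(Counter)
        # Maps a case id to its most recent `window` activities.
        self.history = defaultdict(lambda: deque(maxlen=window))

    def predict(self, case_id):
        """Return the most frequent successor of the case's context, or None."""
        context = tuple(self.history[case_id])
        successors = self.counts.get(context)
        if not successors:
            return None
        return successors.most_common(1)[0][0]

    def update(self, case_id, activity):
        """Consume one streamed event: count it, then extend the case history."""
        context = tuple(self.history[case_id])
        self.counts[context][activity] += 1
        self.history[case_id].append(activity)


def vote(predictors, case_id):
    """Majority vote over the predictors that can make a prediction."""
    ballots = Counter(
        p for p in (m.predict(case_id) for m in predictors) if p is not None
    )
    return ballots.most_common(1)[0][0] if ballots else None


# Usage: stream one complete case, then predict inside a second, partial case.
models = [NGramPredictor(window=1), NGramPredictor(window=2)]
stream = [("c1", "A"), ("c1", "B"), ("c1", "C"), ("c2", "A"), ("c2", "B")]
for case_id, activity in stream:
    for model in models:
        model.update(case_id, activity)

prediction = vote(models, "c2")
```

Note that `vote` queries every predictor on each call, which reflects the overhead of classical voting described above: all ensemble members must be kept in memory and evaluated at inference time.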