Long traces and large event logs originating from sensors and prediction models are becoming increasingly common. In such settings, conformance checking, a key task in process mining, can become computationally infeasible because finding an optimal alignment has exponential complexity. This paper introduces a novel sliding window approach that addresses these scalability challenges while preserving the interpretability of alignment-based methods. By breaking traces into manageable subtraces and iteratively aligning each with the process model, our method significantly reduces the search space. The approach uses global information that captures structural properties of both the trace and the process model to make informed alignment decisions, discarding unpromising alignments even if they are optimal for a local subtrace, which improves the overall accuracy of the results. Experimental evaluations show that the proposed method finds optimal alignments in most cases and highlight its scalability. A theoretical complexity analysis further supports this, showing that the search space grows more slowly than in other common conformance checking methods. This work contributes towards efficient conformance checking for large-scale process mining applications.
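To make the windowing idea concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes a deliberately simplified setting in which the process model is a single linear sequence of activities, and it uses the standard cost function (synchronous move 0, log move 1, model move 1). The function names and the `window` and `lookahead` parameters are hypothetical. The lookahead is only a crude stand-in for the global information described above: each window picks the locally cheapest model position, so the returned cost is an upper bound on the optimal alignment cost rather than a guaranteed optimum.

```python
from typing import List, Sequence


def align_cost_table(sub: Sequence[str], model: Sequence[str]) -> List[int]:
    """For every k, return the cost of aligning `sub` with model[:k].

    Costs: synchronous move 0, log move 1, model move 1.
    """
    prev = list(range(len(model) + 1))  # empty subtrace: k model moves
    for i, act in enumerate(sub, start=1):
        cur = [i]  # empty model prefix: i log moves
        for k in range(1, len(model) + 1):
            best = min(prev[k] + 1, cur[k - 1] + 1)  # log move, model move
            if act == model[k - 1]:
                best = min(best, prev[k - 1])  # synchronous move
            cur.append(best)
        prev = cur
    return prev


def optimal_cost(trace: Sequence[str], model: Sequence[str]) -> int:
    """Reference cost: align the whole trace with the whole model at once."""
    return align_cost_table(trace, model)[-1]


def sliding_window_cost(
    trace: Sequence[str],
    model: Sequence[str],
    window: int = 3,      # hypothetical parameter: subtrace length
    lookahead: int = 2,   # hypothetical parameter: extra model steps considered
) -> int:
    """Align the trace window by window, carrying over the model position."""
    pos, total = 0, 0
    for start in range(0, len(trace), window):
        sub = trace[start:start + window]
        is_last = start + window >= len(trace)
        if is_last:
            # The final window must consume the rest of the model.
            horizon = model[pos:]
            costs = align_cost_table(sub, horizon)
            k = len(horizon)
        else:
            # Only a bounded part of the model is searched per window.
            horizon = model[pos:pos + len(sub) + lookahead]
            costs = align_cost_table(sub, horizon)
            # Locally cheapest end position; ties go to the shorter prefix.
            k = min(range(len(costs)), key=lambda j: (costs[j], j))
        total += costs[k]
        pos += k
    # Empty trace: every model step becomes a model move.
    total += len(model) - pos
    return total
```

In this sketch each window is aligned against at most `window + lookahead` model steps, so the work per window stays bounded regardless of trace length; only the final window sees the full remainder of the model. Comparing `sliding_window_cost` with `optimal_cost` on small inputs shows when the local choices happen to coincide with the optimum and when they do not.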