In recent years, process mining has emerged as a proven technology for analyzing and improving operational processes. As more organizations adopt process mining in their daily operations, the spectrum of processes to be analyzed broadens. Some of these processes are highly unstructured, making it difficult for traditional process discovery approaches to discover a start-to-end model describing the entire process. Therefore, the subdiscipline of Local Process Model (LPM) discovery tries to build a set of LPMs, i.e., smaller models that explain sub-behaviors of the process. However, like other pattern mining approaches, LPM discovery algorithms face the problems of model explosion and model repetition: the algorithms may create hundreds if not thousands of models, subsets of which are close in structure or behavior. This work proposes a three-step pipeline for grouping similar LPMs using various process model similarity measures. We demonstrate the usefulness of grouping through a real-life case study and, on multiple real event logs, analyze the impact of different measures, the severity of repetition in the discovered LPMs, and how it improves after grouping.