A plethora of algorithms in process mining research build on directly-follows relations. Even though various improvements have been made in the last decade, these relations have serious weaknesses. Once events are associated with different objects that relate to each other with cardinalities of 1:N and N:M, techniques based on directly-follows relations produce spurious relations, self-loops, and back-jumps. This is because the event sequence described in classical event logs differs from event causation. In this paper, we address the research problem of representing the causal structure of process-related event data. To this end, we develop a new approach called Causal Process Mining. This approach renounces the use of flat event logs and takes relational databases of event data as input. More specifically, we transform the relational data structures, based on the Causal Process Template, into what we call a Causal Event Graph. We evaluate our approach and compare its outputs with those of techniques based on directly-follows relations in a case study with a European food production company. Our results demonstrate the benefits of enriching process mining with additional knowledge from the domain.
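To make the weakness of directly-follows relations concrete, the following minimal sketch flattens a toy event set onto a single case notion and counts directly-follows pairs. It is an illustration of the problem only, not the Causal Process Mining approach itself; the event data, the order and item identifiers, and the helper `directly_follows` are hypothetical and chosen purely for illustration. One order relates to two items (a 1:N relation).

```python
from collections import Counter

# Hypothetical events: (timestamp, activity, order_id, item_id).
# One order (o1) contains two items (i1, i2), i.e. a 1:N relation.
events = [
    (1, "Create Order", "o1", None),
    (2, "Pick Item", "o1", "i1"),
    (3, "Pack Item", "o1", "i1"),
    (4, "Pick Item", "o1", "i2"),
    (5, "Pack Item", "o1", "i2"),
    (6, "Ship Order", "o1", None),
]


def directly_follows(events, case_of):
    """Count directly-follows pairs after grouping events by a case notion."""
    traces = {}
    for _, activity, order_id, item_id in sorted(events, key=lambda e: e[0]):
        case = case_of(order_id, item_id)
        if case is not None:  # skip events not related to this object type
            traces.setdefault(case, []).append(activity)
    pairs = Counter()
    for trace in traces.values():
        # Each consecutive pair of activities is one directly-follows relation.
        pairs.update(zip(trace, trace[1:]))
    return pairs


# Flattening onto the order mixes the events of both items into one trace.
flat = directly_follows(events, lambda order_id, item_id: order_id)
# Grouping by item keeps each item's own sequence separate.
per_item = directly_follows(events, lambda order_id, item_id: item_id)

print(flat)
print(per_item)
```

In the flattened view, "Pack Item" is directly followed by "Pick Item", a back-jump that reflects only the interleaving of two items, not any causal dependency. Grouping by item yields just "Pick Item" followed by "Pack Item", which hints at why the object relations held in a relational database carry information that a flat event log loses.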