Process discovery algorithms learn process models from executed activity sequences; these models describe concurrency, causality, and conflict. Discovering concurrency requires observing multiple permutations of the concurrent activities, which increases data requirements, especially for processes with concurrent subprocesses such as hierarchical, composite, or distributed processes. While process discovery algorithms traditionally take sequences of activities as input, recently introduced object-centric process discovery algorithms can take graphs of activities as input, which encode partial orders between activities. A single such graph therefore contains the concurrency information of many sequences. In this paper, we address the research question of whether using object-centric event logs reduces the data requirements of process discovery. We classify real-life processes according to the control-flow complexity within and between subprocesses and introduce an evaluation framework that assesses the quality of traditional and object-centric process discovery algorithms as a function of sample size. We complement this with a large-scale production process case study. Our results show reduced data requirements, enabling the discovery of large, concurrent processes such as manufacturing from little data, which was previously infeasible with traditional process discovery. Our findings suggest that object-centric process mining could revolutionize process discovery in various sectors, including manufacturing and supply chains.
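
To make the data-requirement argument concrete, the following minimal sketch enumerates the activity sequences that are consistent with one partial order. It is an illustration only, not the paper's algorithm: the activity names, the `precedes` relation, and the helper `linear_extensions` are hypothetical, and the brute-force enumeration is only practical for a handful of activities.

```python
from itertools import permutations


def linear_extensions(activities, precedes):
    """Return every total order of `activities` that respects `precedes`.

    `precedes` is a set of (a, b) pairs meaning a must occur before b.
    Brute force over all permutations; fine for a handful of activities.
    """
    result = []
    for order in permutations(activities):
        position = {a: i for i, a in enumerate(order)}
        if all(position[a] < position[b] for a, b in precedes):
            result.append(order)
    return result


# Hypothetical process: "start", then three concurrent activities, then "end".
activities = ["start", "a", "b", "c", "end"]
precedes = {("start", x) for x in "abc"} | {(x, "end") for x in "abc"}

sequences = linear_extensions(activities, precedes)
print(len(sequences))  # 3! = 6 distinct sequences are consistent with this one graph
```

Under these assumptions, a sequence-based log would need to contain all six orderings to reveal that `a`, `b`, and `c` are fully concurrent, whereas a single partial-order graph already carries that information. With k concurrent activities the number of orderings grows as k!.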