This technical report examines the intersection of process mining and large language models (LLMs), focusing on the abstraction of traditional and object-centric process mining artifacts into a textual format. We introduce and explore three prompting strategies: direct answering, where the LLM directly addresses user queries; multi-prompt answering, which lets the model incrementally build on knowledge obtained through a series of prompts; and the generation of database queries, which facilitates the validation of hypotheses against the original event log. Our assessment considers two LLMs, GPT-4 and Google's Bard, under various contextual scenarios across all prompting strategies. The results indicate that these models exhibit a robust understanding of key process mining abstractions, with notable proficiency in interpreting both declarative and procedural process models. Both models also perform strongly in the object-centric setting, which could significantly advance the object-centric process mining discipline. Finally, the models display a noteworthy capacity to evaluate various concepts of fairness in process mining, opening the door to faster and more efficient assessments of the fairness of event logs. Integrating these LLMs into process mining applications may open new avenues for exploration, innovation, and insight generation in the field.
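To make the idea of textual abstraction and direct answering concrete, the following minimal Python sketch turns a toy event log into a directly-follows graph, renders that graph as text, and wraps it in a single prompt. The event log, the function names, and the prompt wording are all assumptions made for illustration; they are not the report's own implementation or prompt templates, and no LLM is called.

```python
from collections import Counter

# Hypothetical toy event log: each trace is the ordered activity list of one case.
EVENT_LOG = [
    ["Create Order", "Check Stock", "Ship Order", "Send Invoice"],
    ["Create Order", "Check Stock", "Cancel Order"],
    ["Create Order", "Check Stock", "Ship Order", "Send Invoice"],
]


def directly_follows(log):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def abstract_dfg(counts):
    """Render the directly-follows counts as text, most frequent edges first."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return "\n".join(f"{a} -> {b} (frequency = {n})" for (a, b), n in ordered)


def build_direct_prompt(abstraction, question):
    """Combine the textual abstraction and a user question into one prompt."""
    return (
        "If I have a process with the following directly-follows graph:\n\n"
        f"{abstraction}\n\n{question}"
    )


dfg = directly_follows(EVENT_LOG)
abstraction = abstract_dfg(dfg)
prompt = build_direct_prompt(abstraction, "What are the main anomalies in this process?")

if __name__ == "__main__":
    print(prompt)
```

The resulting string would be sent to the model as a single request in the direct answering strategy. In a multi-prompt or query-generation setting, the same abstraction could serve as the first message, with follow-up prompts or a generated database query building on it.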