In model-based engineering, models are essential artifacts that enable system design and analysis. Traditionally, creating these models has been a manual process requiring not only deep modeling expertise but also substantial domain knowledge of the target system. With the rapid advancement of generative artificial intelligence, large language models (LLMs) show potential for automating model generation. This work explores the generation of instance models using LLMs, focusing specifically on producing XMI-based instance models from Ecore metamodels and natural language specifications. We observe that current LLMs struggle to generate valid XMI models directly. To address this, we propose a two-step approach: first, an LLM produces a simplified structured output containing all necessary instance model information, which we call a conceptual instance model; second, this intermediate representation is compiled into a valid XMI file. The conceptual instance model is format-independent, so it can be transformed into various modeling formats by different compilers. The feasibility of the proposed method is demonstrated using several LLMs, including GPT-4o, o1-preview, and Llama 3.1 (8B and 70B). Results show that the proposed method significantly improves the usability of LLMs for instance model generation tasks. Notably, the smaller open-source model, Llama 3.1 70B, demonstrated performance comparable to proprietary GPT models within the proposed framework.
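The second step of the approach, compiling a conceptual instance model into XMI, can be illustrated with a minimal sketch. The abstract does not specify the intermediate format, so everything here is an assumption made for illustration: the dictionary layout (`objects`, `containments`), the function name `compile_to_xmi`, the `Library`/`Book` example, and the namespace URI `http://example.org/library` are all hypothetical. The sketch handles only attributes and containment references; cross-references and `xsi:type` annotations, which a real Ecore-conformant compiler would need, are left out.

```python
import xml.etree.ElementTree as ET

XMI_NS = "http://www.omg.org/XMI"


def _to_text(value):
    # XMI serializes booleans in lowercase; everything else via str().
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def compile_to_xmi(model, ns_prefix, ns_uri):
    """Compile a (hypothetical) conceptual instance model into an XMI string.

    Assumed input layout:
      model["objects"]      -> list of {"id", "class", "attributes"}
      model["containments"] -> list of {"parent", "feature", "child"}
    """
    ET.register_namespace("xmi", XMI_NS)
    ET.register_namespace(ns_prefix, ns_uri)

    objects = {obj["id"]: obj for obj in model["objects"]}
    children = {}   # parent id -> list of containment links
    contained = set()
    for link in model.get("containments", []):
        if link["parent"] not in objects or link["child"] not in objects:
            raise ValueError(f"containment refers to unknown object: {link}")
        children.setdefault(link["parent"], []).append(link)
        contained.add(link["child"])

    # This sketch expects exactly one object that nothing else contains.
    roots = [oid for oid in objects if oid not in contained]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root object, found {len(roots)}")

    def build(oid, tag):
        obj = objects[oid]
        attrs = {k: _to_text(v) for k, v in obj.get("attributes", {}).items()}
        elem = ET.Element(tag, attrs)
        # Contained objects become child elements named after the reference.
        for link in children.get(oid, []):
            elem.append(build(link["child"], link["feature"]))
        return elem

    root_obj = objects[roots[0]]
    root = build(roots[0], f"{{{ns_uri}}}{root_obj['class']}")
    root.set(f"{{{XMI_NS}}}version", "2.0")
    return ET.tostring(root, encoding="unicode")


# Hypothetical conceptual instance model, as an LLM might emit it in step one.
EXAMPLE_MODEL = {
    "objects": [
        {"id": "lib1", "class": "Library", "attributes": {"name": "City Library"}},
        {"id": "b1", "class": "Book", "attributes": {"title": "Dune", "available": True}},
        {"id": "b2", "class": "Book", "attributes": {"title": "Emma", "available": False}},
    ],
    "containments": [
        {"parent": "lib1", "feature": "books", "child": "b1"},
        {"parent": "lib1", "feature": "books", "child": "b2"},
    ],
}

if __name__ == "__main__":
    print(compile_to_xmi(EXAMPLE_MODEL, "lib", "http://example.org/library"))
```

Under these assumptions, the LLM only has to name objects, attribute values, and containment links. Namespace declarations, the `xmi:version` attribute, and element nesting are produced deterministically by the compiler, and a different compiler could target another format from the same intermediate structure.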