This paper aims to speed up the design and deployment of engineering dynamical systems. It proposes a strategy that exploits domain and expert knowledge to automatically generate a computational model of a dynamical system, starting from a corpus of documents relevant to the system of interest and an input document describing the specific system. The strategy is implemented in five steps and, crucially, uses Systems Modeling Language (SysML) diagrams to extract accurate information about the dependencies, attributes, and operations of components. Natural Language Processing (NLP) strategies and Large Language Models (LLMs) are employed in specific tasks to improve the intermediate outputs of automated SysML diagram generation: the list of key nouns, the list of extracted relationships, the list of key phrases and key relationships, block attribute values, block relationships, and block definition diagram (BDD) generation. The applicability of automated SysML diagram generation is illustrated with different case studies. Computational models of complex dynamical systems are then obtained from the SysML diagrams via a code generation step and a computational model generation step. In the code generation step, NLP strategies are used for summarization, while LLMs are used for validation only. The proposed approach is not limited to a specific system, domain, or computational software. Domain and expert knowledge is integrated by providing a set of equation implementation templates. This work represents one of the first attempts to build an automatic pipeline for this area. The applicability of the proposed approach is shown via an end-to-end example, from text to model, of a simple pendulum, which shows improved performance compared to results yielded by LLMs alone in zero-shot mode.
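To make the idea of an equation implementation template concrete, the following is a minimal sketch for the simple pendulum mentioned above. It is not the paper's pipeline or its generated code: the names `PendulumBlock`, `pendulum_rhs`, and `simulate` are hypothetical, the block attributes are assumed to be the rod length and gravitational acceleration, and the classical fourth-order Runge-Kutta integrator is an illustrative choice rather than something the abstract specifies.

```python
import math
from dataclasses import dataclass


@dataclass
class PendulumBlock:
    """Hypothetical stand-in for a SysML block: holds attribute values only."""
    length: float          # rod length in metres (assumed attribute)
    gravity: float = 9.81  # gravitational acceleration in m/s^2 (assumed attribute)


def pendulum_rhs(block, theta, omega):
    """Equation template for the frictionless pendulum.

    Returns (d theta / dt, d omega / dt) with
    d omega / dt = -(g / L) * sin(theta).
    """
    return omega, -(block.gravity / block.length) * math.sin(theta)


def simulate(block, theta0, omega0, dt, steps):
    """Integrate with classical fourth-order Runge-Kutta.

    Returns a list of (time, theta, omega) tuples, including the initial state.
    """
    theta, omega = theta0, omega0
    trajectory = [(0.0, theta, omega)]
    for i in range(1, steps + 1):
        k1t, k1w = pendulum_rhs(block, theta, omega)
        k2t, k2w = pendulum_rhs(block, theta + 0.5 * dt * k1t, omega + 0.5 * dt * k1w)
        k3t, k3w = pendulum_rhs(block, theta + 0.5 * dt * k2t, omega + 0.5 * dt * k2w)
        k4t, k4w = pendulum_rhs(block, theta + dt * k3t, omega + dt * k3w)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        omega += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        trajectory.append((i * dt, theta, omega))
    return trajectory


if __name__ == "__main__":
    block = PendulumBlock(length=1.0)
    for t, theta, omega in simulate(block, theta0=0.05, omega0=0.0, dt=0.001, steps=2006)[::500]:
        print(f"t={t:.3f} s  theta={theta:+.5f} rad  omega={omega:+.5f} rad/s")
```

Separating the attribute container from the equation function mirrors, loosely, the split the abstract describes between information extracted into SysML blocks and knowledge supplied through templates. Under that assumption, changing the system would mean swapping the right-hand-side function while the integrator stays the same.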