Model-driven engineering (MDE) provides abstraction and analytical rigour, but industrial adoption in many domains has been limited by the cost of developing and maintaining models. Large language models (LLMs) can help shift this cost balance by supporting direct generation of models from natural-language (NL) descriptions. For domain-specific languages (DSLs), however, LLM-generated models may be less accurate than LLM-generated code in mainstream languages such as Python, due to the latter's dominance in LLM training corpora. We investigate this issue in the domain of mathematical optimization, using AMPL, a DSL with established industrial use. We introduce EXEOS, an LLM-based approach that derives AMPL models and Python code from NL problem descriptions and iteratively refines them with solver feedback. Using a public optimization dataset and real-world supply-chain cases from our industrial partner Kinaxis, we evaluate generated AMPL models against Python code in terms of executability and correctness. An ablation study with two LLM families shows that AMPL is competitive with, and sometimes better than, Python, and that our design choices in EXEOS improve the quality of generated specifications.
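To make the solver-in-the-loop idea concrete, the sketch below shows one way a candidate AMPL model could be executed through the amplpy API (current snake_case method names) and repaired from solver feedback. This is only an illustrative assumption, not the EXEOS implementation; `draft_ampl_model` is a hypothetical stand-in for the LLM generation and repair calls.

```python
# Minimal sketch of solver-feedback refinement for an LLM-generated AMPL model.
# Not the EXEOS implementation: draft_ampl_model() is a hypothetical placeholder
# for an LLM client that drafts or repairs AMPL text from an NL description.
from amplpy import AMPL


def draft_ampl_model(description: str, previous: str = "", feedback: str = "") -> str:
    """Placeholder for an LLM call that drafts or repairs an AMPL model."""
    raise NotImplementedError("plug in an LLM client here")


def refine_with_solver(description: str, max_iters: int = 3) -> str:
    model_text = draft_ampl_model(description)
    for _ in range(max_iters):
        ampl = AMPL()
        ampl.set_option("solver", "highs")
        try:
            ampl.eval(model_text)            # parse and load the candidate model
            ampl.solve()
            if ampl.get_value("solve_result") == "solved":
                return model_text            # executable and solved: stop refining
            feedback = ampl.get_value("solve_message")   # solver's status message
        except Exception as exc:             # AMPL syntax or semantic errors
            feedback = str(exc)
        # Feed the parser/solver message back for a repaired draft.
        model_text = draft_ampl_model(description, previous=model_text, feedback=feedback)
    return model_text
```

Under the same assumptions, the analogous loop for generated Python code would use runtime exceptions and solver status as the feedback signal.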