Model-driven engineering (MDE) provides abstraction and analytical rigour, but industrial adoption in many domains has been limited by the cost of developing and maintaining models. Large language models (LLMs) can help shift this cost balance by supporting direct generation of models from natural-language (NL) descriptions. For domain-specific languages (DSLs), however, LLM-generated models may be less accurate than LLM-generated code in mainstream languages such as Python, due to the latter's dominance in LLM training corpora. We investigate this issue in mathematical optimization, with AMPL, a DSL with established industrial use. We introduce EXEOS, an LLM-based approach that derives AMPL models and Python code from NL problem descriptions and iteratively refines them with solver feedback. Using a public optimization dataset and real-world supply-chain cases from our industrial partner Kinaxis, we evaluate generated AMPL models against Python code in terms of executability and correctness. An ablation study with two LLM families shows that AMPL is competitive with, and sometimes better than, Python, and that our design choices in EXEOS improve the quality of generated specifications.