Large Language Models (LLMs) have emerged as powerful tools for automating optimization modeling in complex Operations Research (OR). However, current methodologies rely heavily on prompt engineering (e.g., multi-agent cooperation) with proprietary LLMs, raising data privacy concerns that can be prohibitive in industrial applications. To address this issue, we propose training open-source LLMs for optimization modeling. We identify four critical requirements for the training dataset of OR LLMs, and we design and implement OR-Instruct, a semi-automated process for creating synthetic data tailored to these requirements. We also introduce IndustryOR, the first industrial benchmark for testing LLMs on solving real-world OR problems. We apply the data generated by OR-Instruct to fine-tune various open-source LLMs of 7B size (termed ORLMs), yielding significantly improved optimization modeling capability. Our best-performing ORLM achieves state-of-the-art performance on the NL4OPT, MAMO, and IndustryOR benchmarks. Our code and data will be available at \url{https://github.com/Cardinal-Operations/ORLM}.