Operations Research (OR) relies on expert-driven modeling, a slow and fragile process ill-suited to novel scenarios. Large language models (LLMs) can automatically translate natural language into optimization models, but existing approaches either depend on costly post-training or employ multi-agent frameworks, and most still lack reliable collaborative error correction and task-specific retrieval, which often leads to incorrect outputs. We propose MIRROR, a fine-tuning-free, end-to-end multi-agent framework that translates natural language optimization problems directly into mathematical models and solver code. MIRROR integrates two core mechanisms: (1) execution-driven iterative adaptive revision for automatic error correction, and (2) hierarchical retrieval, which fetches relevant modeling and coding exemplars from a carefully curated exemplar library. Experiments show that MIRROR outperforms existing methods on standard OR benchmarks, with notable results on complex industrial datasets such as IndustryOR and Mamo-ComplexLP. By combining precise external knowledge infusion with systematic error correction, MIRROR offers non-expert users an efficient and reliable OR modeling solution and addresses key limitations of general-purpose LLMs on expert optimization tasks.
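To make the idea of execution-driven iterative revision concrete, the following is a minimal sketch, not MIRROR's actual implementation. It assumes candidate solver code is plain Python that is expected to define a variable named `objective`, and it uses a hypothetical `toy_revise` function as a stand-in for the LLM agent that would rewrite the code from execution feedback. The names `run_candidate`, `revise_until_runs`, and `max_rounds` are illustrative and do not come from the abstract.

```python
import traceback
from typing import Callable, List, Optional, Tuple


def run_candidate(code: str) -> Tuple[bool, str]:
    """Execute candidate code and return (ok, feedback).

    Assumption: a real system would run this in a sandbox, not via exec.
    """
    namespace: dict = {}
    try:
        exec(code, namespace)
    except Exception:
        # The traceback text is the feedback passed to the reviser.
        return False, traceback.format_exc(limit=1)
    if "objective" not in namespace:
        return False, "code ran but did not define `objective`"
    return True, repr(namespace["objective"])


def revise_until_runs(
    code: str,
    revise: Callable[[str, str], str],
    max_rounds: int = 3,
) -> Tuple[Optional[str], List[str]]:
    """Run the code; on failure, ask `revise` for a new version and retry."""
    history: List[str] = []
    for _ in range(max_rounds + 1):
        ok, feedback = run_candidate(code)
        history.append(feedback)
        if ok:
            return code, history
        code = revise(code, feedback)
    return None, history  # gave up after max_rounds revisions


def toy_revise(code: str, feedback: str) -> str:
    """Hypothetical stand-in for an LLM call: fixes one known typo."""
    if "NameError" in feedback:
        return code.replace("pofit", "profit")
    return code


# A draft with a deliberate typo in the second line.
draft = "profit = 3 * 4 + 5 * 2\nobjective = pofit\n"
final_code, history = revise_until_runs(draft, toy_revise)
print(history[-1])
```

In this toy run the first execution fails with a `NameError`, the reviser repairs the typo, and the second execution succeeds. A fuller version would also check solver status and feasibility rather than only whether the code runs.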