We introduce Debate2Create (D2C), a multi-agent LLM framework that formulates robot co-design as a structured, iterative debate grounded in physics-based evaluation. A design agent and a control agent engage in a thesis-antithesis-synthesis loop, while pluralistic LLM judges provide multi-objective feedback to steer exploration. Across five MuJoCo locomotion benchmarks, D2C achieves up to $3.2\times$ the score of the default design on Ant and $\sim9\times$ on Swimmer, outperforming prior LLM-based methods and black-box optimization. Iterative debate yields 18--35% gains over compute-matched zero-shot generation, and D2C-generated rewards transfer to the default morphologies in four of the five tasks. Our results demonstrate that structured multi-agent debate offers an effective alternative to hand-designed objectives for joint morphology-reward optimization.
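To make the loop described in the abstract concrete, the following is a minimal toy sketch of one possible reading of a thesis-antithesis-synthesis debate with several judges. It is not the D2C implementation: the agents here are deterministic stand-ins rather than LLMs, `simulate` is a closed-form placeholder for a physics rollout, and every name (`Candidate`, `limb_scale`, `reward_weight`, `design_agent`, `control_agent`, `performance_judge`, `simplicity_judge`, `debate`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    limb_scale: float     # hypothetical stand-in for morphology parameters
    reward_weight: float  # hypothetical stand-in for a reward-function parameter


def simulate(c: Candidate) -> float:
    """Toy placeholder for a physics rollout (assumption, not MuJoCo).

    The score peaks at limb_scale=1.5 and reward_weight=0.5.
    """
    return -((c.limb_scale - 1.5) ** 2) - (c.reward_weight - 0.5) ** 2


def performance_judge(c: Candidate) -> float:
    # Hypothetical judge: scores task performance.
    return simulate(c)


def simplicity_judge(c: Candidate) -> float:
    # Hypothetical judge: mildly prefers morphologies close to the default (1.0).
    return -0.1 * abs(c.limb_scale - 1.0)


JUDGES: List[Callable[[Candidate], float]] = [performance_judge, simplicity_judge]


def judge_score(c: Candidate) -> float:
    """Aggregate the judges' multi-objective feedback by a plain average."""
    return sum(judge(c) for judge in JUDGES) / len(JUDGES)


def design_agent(c: Candidate, step: float) -> List[Candidate]:
    """Thesis: propose morphology changes, keeping the reward fixed."""
    return [Candidate(c.limb_scale + d, c.reward_weight) for d in (-step, step)]


def control_agent(c: Candidate, step: float) -> List[Candidate]:
    """Antithesis: propose reward changes for the proposed morphology."""
    return [Candidate(c.limb_scale, c.reward_weight + d) for d in (-step, step)]


def debate(initial: Candidate, rounds: int = 20,
           step: float = 0.1) -> Tuple[Candidate, List[float]]:
    """Run the toy debate and return the best candidate and its score history."""
    best = initial
    history = [judge_score(best)]
    for _ in range(rounds):
        thesis = max(design_agent(best, step), key=judge_score)
        antithesis = max(control_agent(thesis, step), key=judge_score)
        # Synthesis: keep whichever of the incumbent and the two proposals
        # the judges rate highest, so the score never decreases.
        best = max([best, thesis, antithesis], key=judge_score)
        history.append(judge_score(best))
    return best, history
```

Calling `best, history = debate(Candidate(1.0, 0.0))` returns the final toy design and the aggregated judge score after each round. Keeping the incumbent in the synthesis step is a design choice of this sketch that makes the score history non-decreasing; the abstract does not say whether D2C does the same.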