Recent advances in multimodal large language models and vision-language-action models have significantly driven progress in Embodied AI. As the field moves toward more complex task scenarios, multi-agent system frameworks are becoming essential for scalable, efficient, and collaborative solutions. This shift is fueled by three primary factors: increasing agent capabilities, improving system efficiency through task delegation, and enabling advanced human-agent interaction. To address the challenges posed by multi-agent collaboration, we propose the Multi-Agent Robotic System (MARS) Challenge, held at the NeurIPS 2025 Workshop on SpaVLE. The competition focuses on two critical areas, planning and control: participants explore multi-agent embodied planning, which uses vision-language models (VLMs) to coordinate tasks, and policy execution, which performs robotic manipulation in dynamic environments. By evaluating the solutions submitted by participants, the challenge provides insights into the design and coordination of embodied multi-agent systems, contributing to the future development of advanced collaborative AI systems.