Robotic planning algorithms direct agents to perform actions in diverse environments to accomplish a task. Large Language Models (LLMs) such as PaLM 2, GPT-3.5, and GPT-4 have revolutionized this domain, using their embedded real-world knowledge to tackle complex tasks involving multiple agents and objects. This paper introduces a planning algorithm that integrates LLMs into the robotics context, improving task-focused execution and success rates. Key to our algorithm is a closed-loop feedback mechanism that provides real-time environmental states and error messages, which are crucial for refining plans when discrepancies arise. The algorithm draws inspiration from the human neural system, emulating its brain-body architecture by dividing planning across two LLMs in a structured, hierarchical fashion. Our method not only surpasses baselines in the VirtualHome environment, with a 35% average increase in task-oriented success rates, but also achieves an execution score of 85%, approaching the human-level benchmark of 94%. Moreover, we demonstrate the effectiveness of the algorithm in real robot scenarios using a realistic physics simulator and the Franka Research 3 arm.
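The closed-loop, two-level structure described above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `closed_loop_plan`, `toy_brain`, `toy_body`, and `make_toy_env` are hypothetical, the two LLMs are replaced by plain Python functions, and the environment is a toy stand-in rather than VirtualHome. The sketch only shows one plausible way that a high-level planner ("brain") and a low-level planner ("body") could be combined with environment feedback that triggers replanning.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces; none of these names come from the paper.
Brain = Callable[[str, str, str], List[str]]  # (task, state, feedback) -> high-level steps
Body = Callable[[str, str], List[str]]        # (step, state) -> low-level actions
Env = Callable[[str], Tuple[bool, str, str]]  # action -> (ok, new_state, error_message)


def closed_loop_plan(task: str, brain: Brain, body: Body, env: Env,
                     state: str, max_replans: int = 3) -> Tuple[bool, List[str]]:
    """Execute a two-level plan, replanning when the environment reports an error."""
    executed: List[str] = []
    feedback = ""
    for _ in range(max_replans + 1):
        failed = False
        # The "brain" proposes high-level steps from the task, state, and last error.
        for step in brain(task, state, feedback):
            # The "body" expands each step into executable low-level actions.
            for action in body(step, state):
                ok, state, error = env(action)
                if not ok:
                    # Feed the error message back and ask the brain for a new plan.
                    feedback = f"action '{action}' failed: {error}"
                    failed = True
                    break
                executed.append(action)
            if failed:
                break
        if not failed:
            return True, executed
    return False, executed


def toy_brain(task: str, state: str, feedback: str) -> List[str]:
    # Stand-in for the high-level LLM: adds a step once an error is reported.
    return ["open fridge", "get milk"] if feedback else ["get milk"]


def toy_body(step: str, state: str) -> List[str]:
    # Stand-in for the low-level LLM: maps a step to primitive actions.
    last = "open fridge" if step == "open fridge" else "grab milk"
    return ["walk to fridge", last]


def make_toy_env() -> Env:
    world = {"fridge_open": False}

    def env(action: str) -> Tuple[bool, str, str]:
        if action == "open fridge":
            world["fridge_open"] = True
        elif action == "grab milk" and not world["fridge_open"]:
            return False, "fridge closed", "the fridge is closed"
        state = "fridge open" if world["fridge_open"] else "fridge closed"
        return True, state, ""

    return env


if __name__ == "__main__":
    success, actions = closed_loop_plan(
        "bring milk", toy_brain, toy_body, make_toy_env(), state="fridge closed")
    print(success, actions)
```

In this toy run the first plan fails because the fridge is closed, the error message is passed back to the high-level planner, and the revised plan succeeds. In a real system the two stub functions would be replaced by prompts to two LLMs, and `env` by a simulator or robot interface.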