Recent advances in large language models (LLMs) have enabled the automatic generation of executable code for task planning and control in embodied agents such as robots, demonstrating the potential of LLM-based embodied intelligence. However, these LLM-based code-as-policies approaches often suffer from limited environmental grounding, particularly in dynamic or partially observable settings, which leads to suboptimal task success rates caused by incorrect or incomplete code generation. In this work, we propose a neuro-symbolic embodied task planning framework that incorporates explicit symbolic verification and interactive validation into the code generation process. In the validation phase, the framework generates exploratory code that actively interacts with the environment to acquire missing observations while preserving task-relevant states. This integrated process strengthens the grounding of the generated code, improving task reliability and success rates in complex environments. We evaluate our framework on RLBench and in real-world settings across dynamic, partially observable scenarios. Experimental results show that our framework improves task success rates by 46.2% over Code-as-Policies baselines and achieves over 86.8% executability of task-relevant actions, thereby enhancing the reliability of task planning in dynamic environments.
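The Python sketch below is a minimal illustration of the verify-then-explore loop summarized above, under simplified assumptions: the `Env` interface, the precondition check, and the stubbed plan are hypothetical placeholders introduced here for exposition, not the framework's actual API or the paper's implementation.

```python
# Hypothetical sketch of symbolic verification + interactive validation for a
# code-as-policies plan. All names (Env, observe, pick, llm-produced plan) are
# illustrative assumptions, not the paper's real interfaces.

from dataclasses import dataclass, field


@dataclass
class Env:
    """Toy partially observable environment: some object poses start unknown."""
    known_poses: dict = field(default_factory=dict)
    hidden_poses: dict = field(default_factory=lambda: {"red_block": (0.4, 0.1, 0.02)})

    def observe(self, obj):
        """Exploratory action: search for an object and record its pose if found."""
        if obj in self.hidden_poses:
            self.known_poses[obj] = self.hidden_poses.pop(obj)
        return self.known_poses.get(obj)

    def pick(self, obj):
        assert obj in self.known_poses, f"pose of {obj} unknown; cannot pick"
        print(f"picking {obj} at {self.known_poses[obj]}")


def symbolic_verify(plan_steps, env):
    """Check each step's symbolic precondition; return objects lacking observations."""
    return [obj for action, obj in plan_steps
            if action == "pick" and obj not in env.known_poses]


def exploratory_code(missing, env):
    """Validation phase: interact with the environment to acquire missing observations."""
    for obj in missing:
        env.observe(obj)


def execute(plan_steps, env):
    """Run the grounded plan once its preconditions are satisfied."""
    for action, obj in plan_steps:
        if action == "pick":
            env.pick(obj)


if __name__ == "__main__":
    env = Env()
    plan = [("pick", "red_block")]        # plan code as produced by the LLM (stubbed here)
    missing = symbolic_verify(plan, env)  # symbolic verification flags ungrounded steps
    if missing:
        exploratory_code(missing, env)    # exploratory code fills in missing observations
    execute(plan, env)                    # the grounded plan now executes successfully
```

In this toy loop, execution is deferred until symbolic verification passes; when it fails, exploratory interaction supplies the missing observations rather than letting the plan run on ungrounded assumptions, which is the behavior the abstract attributes to the validation phase.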