There has been significant research interest in using large language models to give intelligent robots complex reasoning abilities. Existing work focuses on harnessing their ability to reason about the histories of their actions and observations. In this paper, we explore a new dimension in which large language models may benefit robot planning. In particular, we propose Statler, a framework in which large language models are prompted to maintain an estimate of the world state, which is often unobservable, and to track its transitions as new actions are taken. Our framework then conditions each action on the estimate of the current world state. Despite being conceptually simple, our Statler framework significantly outperforms strong competing methods (e.g., Code-as-Policies) on several robot planning tasks. Additionally, it has the potential advantage of scaling up to more challenging long-horizon planning tasks.
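The loop described above (keep a state estimate, condition each action on it, then update the estimate) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the names `run_state_tracking_planner`, `propose_action`, and `update_state`, the dictionary state format, and the instruction format are all assumptions made here, and the two toy functions are rule-based stand-ins for what would be prompted language-model calls.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical state format (an assumption): object name -> current location.
State = Dict[str, str]


def run_state_tracking_planner(
    instructions: List[str],
    initial_state: State,
    propose_action: Callable[[State, str], str],
    update_state: Callable[[State, str], State],
) -> Tuple[List[str], State]:
    """Condition each action on the current state estimate, then update it."""
    state = dict(initial_state)
    actions: List[str] = []
    for instruction in instructions:
        # The action depends on the state estimate, not on the raw history.
        action = propose_action(state, instruction)
        actions.append(action)
        # Track the state transition caused by the action just taken.
        state = update_state(state, action)
    return actions, state


def toy_propose_action(state: State, instruction: str) -> str:
    # Stand-in for an LLM call. Assumed instruction format:
    # "move <object> to <place>".
    _, obj, _, place = instruction.split()
    if state.get(obj) == place:
        return "noop"  # the estimate says the goal already holds
    return f"move {obj} {place}"


def toy_update_state(state: State, action: str) -> State:
    # Stand-in for an LLM call that rewrites the state estimate.
    if action == "noop":
        return state
    _, obj, place = action.split()
    new_state = dict(state)
    new_state[obj] = place
    return new_state


actions, final_state = run_state_tracking_planner(
    ["move cup to shelf", "move cup to shelf", "move ball to box"],
    {"cup": "table", "ball": "floor"},
    toy_propose_action,
    toy_update_state,
)
print(actions)
print(final_state)
```

In this toy run the second instruction yields `noop` because the tracked state already places the cup on the shelf, which illustrates why conditioning on an explicit state estimate can help when the relevant facts are not directly observable at each step.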