An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals or distinct tasks, even during execution. However, most traditional methods require predefined module design, which makes it hard to generalize to different goals. Recent approaches based on large language models allow for more open-ended planning but often require heavy prompt engineering or domain-specific pretrained models. To tackle this, we propose a simple framework that achieves interactive task planning with language models. Our system incorporates both high-level planning and low-level function execution via language. We verify the robustness of our system in generating novel high-level instructions for unseen objectives and its ease of adaptation to different tasks by merely substituting the task guidelines, without the need for additional complex prompt engineering. Furthermore, when the user sends a new request, our system is able to replan accordingly with precision based on the new request, the task guidelines, and the previously executed steps. More details are available on our project website, https://wuphilipp.github.io/itp_site, and in the accompanying video, https://youtu.be/TrKLuyv26_g.
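The replanning behavior described in the abstract (a new plan derived from the new request, the task guidelines, and the previously executed steps) can be illustrated with a minimal sketch. This is not the authors' implementation: `PlannerState`, `build_prompt`, `plan`, `run`, the prompt wording, and the `toy_llm` stand-in with its drink recipes are all hypothetical names and data chosen for illustration, and a real system would call an actual language model and dispatch each step to robot functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PlannerState:
    """Everything the planner conditions on (hypothetical structure)."""
    guidelines: str                      # task guidelines, swapped per task
    request: str                         # current user request
    executed: List[str] = field(default_factory=list)  # steps already done


def build_prompt(state: PlannerState) -> str:
    """Assemble a plain-text prompt from guidelines, request, and history."""
    done = "\n".join(f"- {s}" for s in state.executed) or "- (none)"
    return (
        f"Task guidelines:\n{state.guidelines}\n\n"
        f"User request:\n{state.request}\n\n"
        f"Steps already executed:\n{done}\n\n"
        "List the remaining steps, one per line."
    )


def plan(state: PlannerState, llm: Callable[[str], str]) -> List[str]:
    """Ask the language model for the remaining steps, one per line."""
    reply = llm(build_prompt(state))
    return [line.strip() for line in reply.splitlines() if line.strip()]


def run(
    state: PlannerState,
    llm: Callable[[str], str],
    execute: Callable[[str], None],
    new_requests: Dict[int, str],
) -> List[str]:
    """Execute the plan step by step, replanning when a request arrives.

    `new_requests` simulates user interruptions: it maps "number of steps
    executed so far" to the new request that arrives at that moment.
    """
    steps = plan(state, llm)
    while steps:
        step = steps.pop(0)
        execute(step)                    # low-level function execution
        state.executed.append(step)
        n = len(state.executed)
        if n in new_requests:            # user changed the goal mid-task
            state.request = new_requests[n]
            steps = plan(state, llm)     # replan from the updated state
    return state.executed


def toy_llm(prompt: str) -> str:
    """Stand-in for a language model (assumption: fixed toy recipes)."""
    recipes = {
        "milk tea": ["grasp cup", "add tea", "add milk"],
        "lemon tea": ["grasp cup", "add tea", "add lemon"],
    }
    request = prompt.split("User request:\n")[1].split("\n")[0]
    done = {line[2:] for line in prompt.splitlines() if line.startswith("- ")}
    return "\n".join(s for s in recipes[request] if s not in done)


if __name__ == "__main__":
    state = PlannerState(guidelines="Use only the listed ingredients.",
                         request="milk tea")
    print(run(state, toy_llm, print, {2: "lemon tea"}))
```

In this sketch the user switches from "milk tea" to "lemon tea" after two steps; because the prompt includes the executed steps, the new plan only contains the work that remains. Adapting to a different task would, under these assumptions, amount to changing the `guidelines` string.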