Large language models (LLMs) have demonstrated the potential to perform high-level planning. Yet it remains a challenge for LLMs to comprehend low-level commands, such as joint angle targets or motor torques. This paper proposes using foot contact patterns as an interface that bridges human commands in natural language and a locomotion controller that outputs these low-level commands. The result is an interactive system for quadrupedal robots that lets users flexibly craft diverse locomotion behaviors. We contribute an LLM prompt design, a reward function, and a method to expose the controller to the feasible distribution of contact patterns. The resulting controller achieves diverse locomotion patterns and can be transferred to real robot hardware. Compared with other design choices, the proposed approach achieves a success rate of more than 50% in predicting the correct contact patterns and solves 10 more tasks out of a total of 30. Our project site is: https://saytap.github.io.
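To make the idea of a foot contact pattern concrete, the following minimal sketch encodes a trot-like gait as one row of 0/1 values per leg, where 1 means the foot is on the ground and 0 means it is in the air. The abstract does not specify the encoding, so the binary representation, the leg order (`FL`, `FR`, `RL`, `RR`), and the names `trot_pattern` and `format_pattern` are all assumptions made for illustration, not the paper's implementation.

```python
# Hypothetical illustration of a foot contact pattern: one 0/1 sequence per leg.
# Assumed leg order: front-left, front-right, rear-left, rear-right.
LEGS = ("FL", "FR", "RL", "RR")


def trot_pattern(steps, half_period):
    """Build a trot-like pattern in which diagonal leg pairs alternate.

    1 = foot in contact with the ground, 0 = foot in the air.
    """
    pattern = {leg: [] for leg in LEGS}
    for t in range(steps):
        # The (FL, RR) pair is down during even half-periods, up during odd ones.
        first_pair_down = (t // half_period) % 2 == 0
        for leg in LEGS:
            in_first_pair = leg in ("FL", "RR")
            pattern[leg].append(1 if in_first_pair == first_pair_down else 0)
    return pattern


def format_pattern(pattern):
    """Render the pattern as text, one line per leg."""
    return "\n".join(
        f"{leg}: {''.join(str(c) for c in pattern[leg])}" for leg in LEGS
    )


if __name__ == "__main__":
    print(format_pattern(trot_pattern(steps=12, half_period=3)))
```

A compact textual form like this is the kind of intermediate representation an LLM could plausibly emit and a low-level controller could track, which is the bridging role the abstract describes.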