Large language models (LLMs) have demonstrated the potential to perform high-level planning, yet it remains a challenge for them to comprehend low-level commands such as joint angle targets or motor torques. This paper proposes using foot contact patterns as an interface that bridges human commands in natural language and a locomotion controller that outputs these low-level commands. The result is an interactive system for quadrupedal robots that lets users flexibly craft diverse locomotion behaviors. We contribute an LLM prompt design, a reward function, and a method to expose the controller to the feasible distribution of contact patterns. The resulting controller achieves diverse locomotion patterns and can be transferred to real robot hardware. Compared with other design choices, the proposed approach achieves more than a 50% success rate in predicting the correct contact patterns and solves 10 more tasks out of a total of 30. Our project site is https://saytap.github.io.
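To make the idea of a foot contact pattern concrete, the following is a minimal sketch of one way such a pattern could be encoded. It assumes a binary per-foot timeline (1 for contact, 0 for swing), a leg naming of FL/FR/RL/RR, and a trot gait in which diagonal feet move together. These choices and the helper names `trot_pattern` and `duty_factor` are illustrative assumptions, not details stated in the abstract.

```python
# Hypothetical leg names: front-left, front-right, rear-left, rear-right.
LEGS = ("FL", "FR", "RL", "RR")


def trot_pattern(steps=8, half_period=2):
    """Build an assumed trot contact pattern.

    Returns a dict mapping each leg to a list of 0/1 values,
    where 1 means the foot is on the ground at that time step.
    """
    # Phase flips between 0 and 1 every `half_period` steps.
    phase = [(t // half_period) % 2 for t in range(steps)]
    diag_a = [1 - p for p in phase]  # FL and RR touch down first
    diag_b = list(phase)             # FR and RL touch down second
    return {
        "FL": list(diag_a),
        "FR": list(diag_b),
        "RL": list(diag_b),
        "RR": list(diag_a),
    }


def duty_factor(timeline):
    """Fraction of time steps in which a foot is in contact."""
    return sum(timeline) / len(timeline)


pattern = trot_pattern()
for leg in LEGS:
    # One row per leg, one character per time step.
    print(leg, "".join(str(c) for c in pattern[leg]))
```

In a pipeline of the kind the abstract describes, an LLM would produce a structure like `pattern` from a natural-language command, and the locomotion controller would take it as its desired contact schedule.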