Large Language Models (LLMs) are gaining popularity in the field of robotics. However, LLM-based robots remain limited to simple, repetitive motions due to poor integration among the language model, the robot, and its environment. This paper proposes a novel approach to enhancing the performance of LLM-based autonomous manipulation through Human-Robot Collaboration (HRC). The approach uses a prompted GPT-4 language model to decompose high-level language commands into sequences of motions that the robot can execute. The system also employs a YOLO-based perception algorithm that provides visual cues to the LLM, helping it plan feasible motions within the specific environment. Additionally, an HRC method combining teleoperation and Dynamic Movement Primitives (DMP) is proposed, allowing the LLM-based robot to learn from human guidance. Real-world experiments were conducted on manipulation tasks using the Toyota Human Support Robot. The outcomes indicate that tasks requiring complex trajectory planning and reasoning over environments can be accomplished efficiently by incorporating human demonstrations.
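To make the DMP component concrete, below is a minimal sketch of a single-degree-of-freedom discrete DMP rollout in the standard Ijspeert-style formulation (transformation system driven by a canonical system and a learned forcing term). The function name, gains, and basis-function parameters here are illustrative assumptions, not the paper's actual implementation; in the paper's setting, the weights would be fit from a teleoperated human demonstration.

```python
import numpy as np

def dmp_rollout(y0, g, weights, centers, widths,
                tau=1.0, dt=0.01, alpha_z=25.0, beta_z=6.25, alpha_x=1.0):
    """Integrate a 1-DOF discrete DMP with Euler steps.

    Transformation system:  tau*z' = alpha_z*(beta_z*(g - y) - z) + f(x)
                            tau*y' = z
    Canonical system:       tau*x' = -alpha_x * x
    Forcing term f(x) is a normalized mix of Gaussian basis functions,
    scaled by x and by the movement amplitude (g - y0).
    All gains are illustrative defaults (critically damped for beta_z = alpha_z/4).
    """
    n_steps = int(tau / dt)
    y, z, x = float(y0), 0.0, 1.0
    traj = []
    for _ in range(n_steps):
        psi = np.exp(-widths * (x - centers) ** 2)            # basis activations
        f = (psi @ weights) / (psi.sum() + 1e-10) * x * (g - y0)
        z += dt / tau * (alpha_z * (beta_z * (g - y) - z) + f)
        y += dt / tau * z
        x += dt / tau * (-alpha_x * x)
        traj.append(y)
    return np.array(traj)

# With zero weights the forcing term vanishes and the system converges to the goal.
traj = dmp_rollout(y0=0.0, g=1.0,
                   weights=np.zeros(10),
                   centers=np.linspace(0.0, 1.0, 10),
                   widths=np.full(10, 50.0))
```

The appeal of DMPs in this HRC setting is that the nonlinear forcing term encodes the demonstrated trajectory shape, while the goal `g` can be re-targeted at runtime (e.g., from the YOLO detections) without re-teaching.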