Robot learning relies heavily on human expertise and effort, such as providing demonstrations, designing reward functions for reinforcement learning, and evaluating performance through human feedback. This reliance makes learning expensive and makes skill learning difficult to scale. In this work, we introduce the Large Language Model Supervised Robotics Text2Skill Autonomous Learning (ARO) framework, which aims to replace human participation in the robot skill learning process with large language models that handle both reward function design and performance evaluation. We provide evidence that our approach enables fully autonomous robot skill learning and can complete some tasks without human intervention. We also analyze the limitations of this approach in task understanding and optimization stability.