The proliferation of Large Language Models (LLMs) has fueled a shift in robot learning from automation towards general embodied Artificial Intelligence (AI). Combining foundation models with traditional learning methods for robot learning has attracted growing interest from the research community and has shown potential for real-life applications. However, few works comprehensively review these relatively new technologies in combination with robotics. The purpose of this review is to systematically assess state-of-the-art foundation model techniques in robot learning and to identify promising directions for future research. Specifically, we first summarize the technical evolution of robot learning and identify the preliminary preparations that foundation models require, including simulators, datasets, and foundation model frameworks. We then focus on four mainstream areas of robot learning, namely manipulation, navigation, planning, and reasoning, and demonstrate how foundation model techniques can be adopted in each. Furthermore, we discuss critical issues that are neglected in the current literature, including the decoupling of robot hardware and software, dynamic data, and generalization performance in the presence of humans, among others. This review highlights the state-of-the-art progress of foundation models in robot learning and suggests that future research should focus on multimodal interaction (especially dynamics data), foundation models built exclusively for robots, and AI alignment, among other directions.