Building general-purpose robots that operate seamlessly in any environment, with any object, and use a variety of skills to complete diverse tasks has been a long-standing goal in Artificial Intelligence. Most existing robotic systems, however, are constrained: they are designed for specific tasks, trained on specific datasets, and deployed within specific environments. These systems usually require extensively labeled data, rely on task-specific models, generalize poorly when deployed in real-world scenarios, and struggle to remain robust to distribution shifts. Motivated by the impressive open-set performance and content generation capabilities of web-scale, large-capacity pre-trained models (i.e., foundation models) in research fields such as Natural Language Processing (NLP) and Computer Vision (CV), we devote this survey to exploring (i) how existing foundation models from NLP and CV can be applied to robotics, and (ii) what a robotics-specific foundation model would look like. We begin with an overview of what constitutes a conventional robotic system and the fundamental barriers to making it universally applicable. Next, we establish a taxonomy to discuss current work that leverages existing foundation models for robotics and work that develops foundation models tailored to robotics. Finally, we discuss key challenges and promising future directions in using foundation models to enable general-purpose robotic systems. We encourage readers to view our living GitHub repository of resources, which includes the papers reviewed in this survey as well as related projects and repositories for developing foundation models for robotics.