Building general-purpose robots that operate seamlessly in any environment, with any object, and that use a variety of skills to complete diverse tasks has been a long-standing goal in Artificial Intelligence. Most existing robotic systems, however, are constrained: they are designed for specific tasks, trained on specific datasets, and deployed within specific environments. These systems usually require extensively labeled data, rely on task-specific models, generalize poorly when deployed in real-world scenarios, and struggle to remain robust to distribution shifts. Motivated by the impressive open-set performance and content generation capabilities of web-scale, large-capacity pre-trained models (i.e., foundation models) in research fields such as Natural Language Processing (NLP) and Computer Vision (CV), we devote this survey to exploring (i) how existing foundation models from NLP and CV can be applied to robotics, and (ii) what a robotics-specific foundation model would look like. We begin with an overview of what constitutes a conventional robotic system and the fundamental barriers to making it universally applicable. Next, we establish a taxonomy to discuss current work that leverages existing foundation models for robotics and work that develops foundation models tailored to robotics. Finally, we discuss key challenges and promising future directions in using foundation models to enable general-purpose robotic systems. We encourage readers to view our "living" GitHub repository of resources, which includes the papers reviewed in this survey as well as related projects and repositories for developing foundation models for robotics.