Language-conditioned robotic manipulation is an active area of research that aims to enable communication and cooperation between humans and robotic agents by teaching robotic systems to comprehend and execute instructions conveyed in natural language. Achieving this requires robust language understanding models capable of extracting actionable information from textual input. In this survey, we systematically review recent advances in language-conditioned approaches to robotic manipulation. We analyze these approaches by learning paradigm: reinforcement learning, imitation learning, and the integration of foundation models such as large language models and vision-language models. We further provide a comparative analysis covering semantic information extraction, environment and evaluation, auxiliary tasks, and task representation. Finally, we outline potential future research directions for language-conditioned learning in robotic manipulation, focusing on generalization capabilities and safety issues. The GitHub repository for this paper can be found at https://github.com/hk-zh/language-conditioned-robot-manipulation-models