Embodied robots that can interact with their environment and neighbours are increasingly being used as a test case for developing Artificial Intelligence. This creates a need for multimodal robot controllers that can operate across different types of information, including text. Large Language Models can process and generate textual as well as audiovisual data and, more recently, robot actions. Language models are increasingly being applied to robotic systems, and these language-based robots leverage them in a variety of ways. The use of language also opens up multiple forms of information exchange between members of a human-robot team. This survey motivates the use of language models in robotics and then categorises existing works by the part of the overall control flow in which language is incorporated. Language can be used by a human to task a robot, by a robot to inform a human, between robots as a human-like communication medium, and internally for a robot's planning and control. Applications of language-based robots are explored, and numerous limitations and challenges are discussed to summarise the development needed for the future of language-based robotics.