In recent years, the integration of large language models (LLMs) has revolutionized the field of robotics, enabling robots to communicate, understand, and reason with human-like proficiency. This survey explores the multifaceted impact of LLMs on robotics, addressing the key challenges and opportunities in applying these models across domains. By categorizing and analyzing LLM applications within the core elements of robotics (communication, perception, planning, and control), we aim to provide actionable insights for researchers seeking to integrate LLMs into their robotic systems. Our investigation focuses on LLMs developed after GPT-3.5, primarily in text-based modalities, while also considering multimodal approaches for perception and control. We offer guidelines and examples for prompt engineering to make LLM-based robotics solutions accessible to beginners. Through tutorial-level examples and structured prompt construction, we illustrate how LLM-guided enhancements can be integrated into robotics applications. The survey serves as a roadmap for researchers navigating the evolving landscape of LLM-driven robotics, combining an overview of the field with practical guidance for using language models in robotics development.