In recent years, the integration of large language models (LLMs) has revolutionized the field of robotics, enabling robots to communicate, understand, and reason with human-like proficiency. This paper surveys the impact of LLMs on robotics, addressing the key challenges and opportunities in applying these models across domains. By categorizing and analyzing LLM applications within four core robotics elements -- communication, perception, planning, and control -- we aim to provide actionable insights for researchers seeking to integrate LLMs into their robotic systems. Our investigation focuses on LLMs developed after GPT-3.5, primarily in text-based modalities, while also considering multimodal approaches for perception and control. We offer guidelines and examples for prompt engineering to make LLM-based robotics solutions accessible to beginners. Through tutorial-level examples and structured prompt construction, we illustrate how LLM-guided enhancements can be integrated into robotics applications. This survey serves as a roadmap for researchers navigating the evolving landscape of LLM-driven robotics, offering both an overview and practical guidance for using language models in robotics development.