In recent years, the integration of large language models (LLMs) has revolutionized the field of robotics, enabling robots to communicate, understand, and reason with human-like proficiency. This paper explores the multifaceted impact of LLMs on robotics, addressing key challenges and opportunities for leveraging these models across various domains. By categorizing and analyzing LLM applications within the core elements of robotics -- communication, perception, planning, and control -- we aim to provide actionable insights for researchers seeking to integrate LLMs into their robotic systems. Our investigation focuses on LLMs developed after GPT-3.5, primarily those operating on text, while also considering multimodal approaches for perception and control. We offer guidelines and examples for prompt engineering to make LLM-based robotics solutions accessible to beginners. Through tutorial-level examples and structured prompt construction, we illustrate how LLM-guided enhancements can be integrated into robotics applications. This survey serves as a roadmap for researchers navigating the evolving landscape of LLM-driven robotics, offering a comprehensive overview and practical guidance for applying language models in robotics development.