In recent years, the rapid advancement of Large Language Models (LLMs) such as the Generative Pre-trained Transformer (GPT) has attracted increasing attention due to their potential in a variety of practical applications. The integration of LLMs with embodied intelligence has emerged as a significant area of focus. Among the many applications of LLMs, navigation tasks are particularly noteworthy because they demand a deep understanding of the environment and quick, accurate decision-making. LLMs can augment embodied intelligence systems with sophisticated environmental perception and decision-making support, leveraging their robust language and image-processing capabilities. This article offers a comprehensive summary of the interplay between LLMs and embodied intelligence, with a focus on navigation. It reviews state-of-the-art models and research methodologies, and assesses the advantages and disadvantages of existing embodied navigation models and datasets. Finally, based on current research, the article clarifies the role of LLMs in embodied intelligence and forecasts future directions in the field. A comprehensive list of the studies covered in this survey is available at https://github.com/Rongtao-Xu/Awesome-LLM-EN