In recent years, the rapid advancement of Large Language Models (LLMs) such as the Generative Pre-trained Transformer (GPT) has attracted increasing attention because of their potential in a variety of practical applications. Combining LLMs with embodied intelligence has emerged as a significant area of focus. Among the many applications of LLMs, navigation tasks are particularly noteworthy because they demand a deep understanding of the environment and quick, accurate decision-making. Leveraging their robust language and image-processing capabilities, LLMs can augment embodied intelligence systems with sophisticated environmental perception and decision-making support. This article offers a comprehensive summary of the interplay between LLMs and embodied intelligence, with a focus on navigation. It reviews state-of-the-art models and research methodologies, and assesses the advantages and disadvantages of existing embodied navigation models and datasets. Finally, based on current research, the article clarifies the role of LLMs in embodied intelligence and forecasts future directions in the field. A comprehensive list of the studies covered in this survey is available at https://github.com/Rongtao-Xu/Awesome-LLM-EN.