In recent years, Large Language Models (LLMs) have demonstrated remarkable capabilities in understanding and solving mathematical problems, leading to advances in various fields. We propose an LLM-embodied path planning framework for mobile agents that addresses both high-level coverage path planning and low-level control. The multi-layer architecture uses prompted LLMs in the path planning phase and integrates them with the mobile agents' low-level actuators. To compare LLMs in this embodied setting, we also propose a coverage-weighted path planning metric. Our experiments show that the framework improves LLMs' spatial reasoning on the 2D plane and enables them to complete coverage path planning tasks, and that, by leveraging the natural language understanding and generative capabilities of LLMs, it significantly improves the efficiency and accuracy of these tasks. We tested three LLM kernels: gpt-4o, gemini-1.5-flash, and claude-3.5-sonnet. Among them, claude-3.5-sonnet completes the coverage planning task across the different scenarios and scores better on the evaluation metrics than the other two models.
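The abstract does not define the coverage-weighted metric, so the following is only an illustrative sketch of what such a score could look like on a grid map, not the authors' formulation. It assumes a hypothetical occupancy grid (0 = free, 1 = obstacle) and a path given as a list of (row, column) cells, and it multiplies the fraction of free cells covered by a simple efficiency term that penalizes revisits and steps onto obstacles. The function names `coverage_weighted_score` and `boustrophedon_path` are made up for this example.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]


def coverage_weighted_score(grid: List[List[int]], path: List[Cell]) -> float:
    """Illustrative score in [0, 1]; NOT the metric defined in the paper.

    grid: hypothetical occupancy grid, 0 = free cell, 1 = obstacle.
    path: sequence of (row, col) cells the agent occupies, in order.
    """
    free: Set[Cell] = {
        (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value == 0
    }
    if not free or not path:
        return 0.0

    # Distinct free cells that the path actually reaches.
    visited = {cell for cell in path if cell in free}

    coverage = len(visited) / len(free)    # share of free space covered
    efficiency = len(visited) / len(path)  # drops with revisits or invalid steps
    return coverage * efficiency


def boustrophedon_path(rows: int, cols: int) -> List[Cell]:
    """Back-and-forth sweep over an obstacle-free rows x cols grid (baseline)."""
    path: List[Cell] = []
    for r in range(rows):
        columns = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        path.extend((r, c) for c in columns)
    return path


if __name__ == "__main__":
    empty_grid = [[0] * 3 for _ in range(3)]
    print(coverage_weighted_score(empty_grid, boustrophedon_path(3, 3)))
```

Under these assumptions, a sweep that visits every free cell exactly once scores 1.0, while a path that misses cells or backtracks scores lower. A path produced by an LLM planner could be scored the same way once it is converted to a list of grid cells.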