Recent advances in metric, semantic, and topological mapping have equipped autonomous robots with semantic concept grounding capabilities to interpret natural language tasks. This work aims to leverage these new capabilities with an efficient task planning algorithm for hierarchical metric-semantic models. We consider a scene graph representation of the environment and utilize a large language model (LLM) to convert a natural language task into a linear temporal logic (LTL) automaton. Our main contribution is to enable optimal hierarchical LTL planning with LLM guidance over scene graphs. To achieve efficiency, we construct a hierarchical planning domain that captures the attributes and connectivity of the scene graph and the task automaton, and provide semantic guidance via an LLM heuristic function. To guarantee optimality, we design an LTL heuristic function that is provably consistent and supplements the potentially inadmissible LLM guidance in multi-heuristic planning. We demonstrate efficient planning of complex natural language tasks in scene graphs of virtualized real environments.
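The core idea of planning over a scene graph combined with a task automaton can be illustrated with a minimal sketch. The example below is not the authors' algorithm: it omits the hierarchy, the LLM heuristic, and multi-heuristic search, and runs plain A* over the product of a toy room graph and a hand-written automaton for the task "visit the kitchen, then the bedroom". The room names, edge costs, automaton, and the functions `step`, `heuristic`, and `plan` are all assumptions made for illustration. The heuristic counts remaining automaton transitions scaled by the cheapest edge cost; since each move advances the automaton by at most one state, it is consistent in this toy setting.

```python
import heapq

# Hypothetical toy scene graph: undirected weighted edges between rooms.
EDGES = {
    ("hall", "kitchen"): 2.0,
    ("hall", "bedroom"): 1.0,
    ("kitchen", "bedroom"): 4.0,
    ("hall", "office"): 1.5,
}
MIN_EDGE = min(EDGES.values())
ACCEPT = 2  # accepting automaton state


def neighbors(node):
    """Yield (neighbor, cost) pairs for an undirected graph."""
    for (a, b), cost in EDGES.items():
        if a == node:
            yield b, cost
        elif b == node:
            yield a, cost


def step(q, node):
    """Hand-written automaton: 0 -> 1 on kitchen, 1 -> 2 on bedroom."""
    if q == 0 and node == "kitchen":
        return 1
    if q == 1 and node == "bedroom":
        return 2
    return q


def heuristic(q):
    """Remaining automaton transitions times the cheapest edge cost."""
    return MIN_EDGE * (ACCEPT - q)


def plan(start):
    """A* over product states (room, automaton state)."""
    q0 = step(0, start)
    start_state = (start, q0)
    best = {start_state: 0.0}
    frontier = [(heuristic(q0), 0.0, start_state, [start])]
    while frontier:
        _, g, (node, q), path = heapq.heappop(frontier)
        if q == ACCEPT:
            return g, path
        if g > best.get((node, q), float("inf")):
            continue  # stale queue entry
        for nxt, cost in neighbors(node):
            nq = step(q, nxt)
            ng = g + cost
            if ng < best.get((nxt, nq), float("inf")):
                best[(nxt, nq)] = ng
                heapq.heappush(
                    frontier, (ng + heuristic(nq), ng, (nxt, nq), path + [nxt])
                )
    return None


cost, path = plan("hall")
print(cost, path)
```

Starting from `hall`, the search returns to the hall after the kitchen instead of taking the direct but more expensive kitchen-to-bedroom edge, which shows why the planner must track the automaton state alongside the robot's location.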