SPINE: Online Semantic Planning for Missions with Incomplete Natural Language Specifications in Unstructured Environments

As robots become increasingly capable, users will want to describe high-level missions and have robots fill in the gaps. In many realistic settings, pre-built maps are difficult to obtain, so execution requires mission-specific exploration and mapping. Consider an emergency response scenario where a user commands a robot, "triage impacted regions." The robot must infer relevant semantics (victims, etc.) and exploration targets (damaged regions) based on priors or other context, then explore and refine its plan online. Such missions are incompletely specified: they imply subtasks and semantics rather than stating them. While many semantic planning methods operate online, they are typically designed for well-specified tasks such as object search or exploration. Recently, Large Language Models (LLMs) have demonstrated powerful contextual reasoning over a range of robotic tasks described in natural language. However, existing LLM planners typically do not consider online planning or complex missions; rather, relevant subtasks are provided by a pre-built map or a user. We address these limitations via SPINE (online Semantic Planner for missions with Incomplete Natural language specifications in unstructured Environments). SPINE uses an LLM to reason about the subtasks implied by the mission, then realizes these subtasks in a receding horizon framework. Tasks are automatically validated for safety and refined online with new observations. We evaluate SPINE in simulation and real-world settings. Evaluation missions require multiple steps of semantic reasoning and exploration in cluttered outdoor environments covering over 20,000 m$^2$. We evaluate SPINE against competitive baselines in single-agent and air-ground teaming applications. Please find videos and software on our project page: https://zacravichandran.github.io/SPINE
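
The loop described above (propose subtasks, validate them, execute a short horizon, then replan with new observations) can be illustrated with a minimal sketch. This is not the SPINE implementation: `Subtask`, `SemanticMap`, `is_valid`, `run_mission`, the toy planner that stands in for the LLM, and the `DISCOVERIES` table are all hypothetical names and data chosen for illustration, and the safety check is reduced to set membership.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass(frozen=True)
class Subtask:
    action: str  # e.g. "explore" (hypothetical action vocabulary)
    target: str  # name of a region in the map


@dataclass
class SemanticMap:
    known: Set[str] = field(default_factory=set)  # regions observed so far
    safe: Set[str] = field(default_factory=set)   # regions judged traversable


def is_valid(task: Subtask, world: SemanticMap) -> bool:
    """Reject subtasks whose target is unknown or not marked safe."""
    return task.target in world.known and task.target in world.safe


def run_mission(
    mission: str,
    world: SemanticMap,
    propose: Callable[[str, SemanticMap, List[Subtask]], List[Subtask]],
    execute: Callable[[Subtask, SemanticMap], None],
    horizon: int = 1,
    max_steps: int = 20,
) -> List[Subtask]:
    """Receding-horizon loop: plan, validate, execute a prefix, replan."""
    log: List[Subtask] = []
    for _ in range(max_steps):
        plan = [t for t in propose(mission, world, log) if is_valid(t, world)]
        if not plan:
            break  # nothing valid left to do
        for task in plan[:horizon]:
            execute(task, world)  # may add new observations to the map
            log.append(task)
    return log


# Assumed toy data: what each region reveals when explored, as (name, is_safe).
DISCOVERIES: Dict[str, List[Tuple[str, bool]]] = {
    "road": [("building", True), ("collapsed_bridge", False)],
    "building": [("courtyard", True)],
}


def toy_propose(mission: str, world: SemanticMap, log: List[Subtask]) -> List[Subtask]:
    """Stand-in for the LLM planner: explore every known region not yet visited."""
    done = set(log)
    candidates = [Subtask("explore", r) for r in sorted(world.known)]
    return [t for t in candidates if t not in done]


def toy_execute(task: Subtask, world: SemanticMap) -> None:
    """Simulated execution: exploring a region reveals its neighbors."""
    for name, safe in DISCOVERIES.get(task.target, []):
        world.known.add(name)
        if safe:
            world.safe.add(name)


def demo() -> List[str]:
    world = SemanticMap(known={"road"}, safe={"road"})
    log = run_mission("triage impacted regions", world, toy_propose, toy_execute)
    return [t.target for t in log]
```

Calling `demo()` starts from a map containing only `road`, discovers further regions as it explores, and never executes a subtask targeting the unsafe `collapsed_bridge`. With `horizon=1` the plan is regenerated after every executed subtask, which is the simplest form of replanning on new observations.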
