SELP: Generating Safe and Efficient Task Plans for Robot Agents with Large Language Models

Despite significant advances in large language models (LLMs) that improve robot agents' understanding and execution of natural language (NL) commands, ensuring that the agents adhere to user-specified constraints remains challenging, particularly for complex commands and long-horizon tasks. To address this challenge, we present three key insights: equivalence voting, constrained decoding, and domain-specific fine-tuning, which significantly enhance LLM planners' ability to handle complex tasks. Equivalence voting ensures consistency by generating and sampling multiple Linear Temporal Logic (LTL) formulas from NL commands, grouping equivalent LTL formulas, and selecting a formula from the majority group as the final LTL formula. Constrained decoding then uses the generated LTL formula to constrain the autoregressive inference of plans, ensuring that the generated plans conform to the LTL formula. Domain-specific fine-tuning customizes LLMs to produce safe and efficient plans within specific task domains. Our approach, Safe Efficient LLM Planner (SELP), combines these insights to create LLM planners that generate plans adhering to user commands with high confidence. We demonstrate the effectiveness and generalizability of SELP across different robot agents and tasks, including drone navigation and robot manipulation. For drone navigation tasks, SELP outperforms state-of-the-art planners by 10.8% in safety rate (i.e., the rate of finishing tasks in conformance with the NL commands) and by 19.8% in plan efficiency. For robot manipulation tasks, SELP achieves a 20.4% improvement in safety rate. Our datasets for evaluating NL-to-LTL translation and robot task planning will be released at github.com/lt-asset/selp.
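As an illustration of the equivalence voting step described above, the following is a minimal sketch and not SELP's implementation. The abstract does not say how equivalence between formulas is decided, so this sketch assumes a bounded check: two formulas count as equivalent if they agree on every finite trace up to a small length. The tuple encoding of formulas, the function names (`holds`, `signature`, `equivalence_vote`), and the example candidates are all hypothetical; a real system would more likely rely on an automata-based LTL equivalence checker.

```python
from collections import Counter
from itertools import product

# Assumed encoding (not from the paper): formulas are nested tuples.
#   ("ap", "a"), ("not", f), ("and", f, g), ("or", f, g),
#   ("X", f) next, ("F", f) eventually, ("G", f) always, ("U", f, g) until

def holds(f, trace, i=0):
    """Evaluate f at position i of a finite trace (a sequence of sets of true props)."""
    op = f[0]
    if op == "ap":
        return f[1] in trace[i]
    if op == "not":
        return not holds(f[1], trace, i)
    if op == "and":
        return holds(f[1], trace, i) and holds(f[2], trace, i)
    if op == "or":
        return holds(f[1], trace, i) or holds(f[2], trace, i)
    if op == "X":
        return i + 1 < len(trace) and holds(f[1], trace, i + 1)
    if op == "F":
        return any(holds(f[1], trace, j) for j in range(i, len(trace)))
    if op == "G":
        return all(holds(f[1], trace, j) for j in range(i, len(trace)))
    if op == "U":
        return any(
            holds(f[2], trace, j) and all(holds(f[1], trace, k) for k in range(i, j))
            for j in range(i, len(trace))
        )
    raise ValueError(f"unknown operator: {op}")

def signature(f, props, max_len=3):
    """Truth value of f on every trace of length 1..max_len over props.

    Equal signatures mean the formulas agree on all bounded traces, which is
    only an approximation of true LTL equivalence (an assumption of this sketch).
    """
    states = [
        frozenset(p for p, bit in zip(props, bits) if bit)
        for bits in product([False, True], repeat=len(props))
    ]
    return tuple(
        holds(f, trace)
        for n in range(1, max_len + 1)
        for trace in product(states, repeat=n)
    )

def equivalence_vote(candidates, props, max_len=3):
    """Group candidates by signature and return the first formula of the largest group."""
    sigs = [signature(f, props, max_len) for f in candidates]
    winner_sig, _ = Counter(sigs).most_common(1)[0]
    return candidates[sigs.index(winner_sig)]

# Hypothetical samples for "eventually reach a and never enter b".
a, b = ("ap", "a"), ("ap", "b")
c1 = ("and", ("F", a), ("G", ("not", b)))
c2 = ("and", ("G", ("not", b)), ("F", a))   # same meaning, operands swapped
c3 = ("and", ("F", a), ("not", ("F", b)))   # same meaning, G rewritten as not-F
c4 = ("F", ("and", a, ("not", b)))          # different meaning: b allowed later
chosen = equivalence_vote([c4, c1, c2, c3], ["a", "b"])
```

Here `c1`, `c2`, and `c3` differ syntactically but fall into one group, so the outlier `c4` is outvoted even though it was sampled first. Grouping by meaning rather than by string match is the point of the technique: exact-match voting would treat all four samples as distinct.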
