Large Language Models (LLMs) have demonstrated impressive planning abilities owing to their vast "world knowledge". Yet obtaining plans that are both feasible (grounded in affordances) and cost-effective (in plan length) remains a challenge, despite recent progress. This contrasts with heuristic planning methods, which employ domain knowledge (formalized in action models such as PDDL) and heuristic search to generate feasible, optimal plans. Inspired by this, we propose to combine the strengths of LLMs and heuristic planning by leveraging the world knowledge of LLMs and the principles of heuristic search. Our approach, SayCanPay, employs LLMs to generate actions (Say), guided by learnable domain knowledge that evaluates each action's feasibility (Can) and long-term reward/payoff (Pay), and uses heuristic search to select the best sequence of actions. Our contributions are (1) a novel framing of the LLM planning problem in the context of heuristic planning, (2) the integration of grounding and cost-effectiveness into the generated plans, and (3) the use of heuristic search over actions. Our extensive evaluations show that our model surpasses other LLM planning approaches.
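The Say/Can/Pay decomposition described above can be illustrated with a minimal sketch. This is not the paper's implementation: the scoring functions, the toy key-and-door domain, the multiplication of the three scores, and the choice of beam search as the heuristic search are all assumptions made for illustration, and every name in the code is hypothetical.

```python
from typing import Callable, List, Tuple

Plan = List[str]
Scorer = Callable[[Plan, str], float]


def beam_search_plan(
    candidates: Callable[[Plan], List[str]],
    say: Scorer,
    can: Scorer,
    pay: Scorer,
    is_done: Callable[[Plan], bool],
    beam_width: int = 2,
    max_steps: int = 10,
) -> Plan:
    """Return the highest-scoring plan found by beam search.

    Assumption: each step is scored as say * can * pay, and a plan's
    score is the product of its step scores.
    """
    beams: List[Tuple[float, Plan]] = [(1.0, [])]
    for _ in range(max_steps):
        expanded: List[Tuple[float, Plan]] = []
        for score, plan in beams:
            if is_done(plan):
                expanded.append((score, plan))  # carry finished plans forward
                continue
            for action in candidates(plan):
                step = say(plan, action) * can(plan, action) * pay(plan, action)
                expanded.append((score * step, plan + [action]))
        if not expanded:
            break  # no candidate actions were proposed
        # Keep only the best `beam_width` plans.
        beams = sorted(expanded, key=lambda b: b[0], reverse=True)[:beam_width]
        if all(is_done(plan) for _, plan in beams):
            break
    return beams[0][1]


# Toy domain (hypothetical): a door that can only be opened with a key.
ACTIONS = ["pick up key", "open door", "wander"]


def toy_candidates(plan: Plan) -> List[str]:
    return ACTIONS


def toy_say(plan: Plan, action: str) -> float:
    # Stand-in for an LLM's probability of proposing each action.
    return {"pick up key": 0.3, "open door": 0.5, "wander": 0.2}[action]


def toy_can(plan: Plan, action: str) -> float:
    # Stand-in for a feasibility model: the door needs the key first.
    if action == "open door":
        return 1.0 if "pick up key" in plan else 0.0
    if action == "pick up key":
        return 0.0 if "pick up key" in plan else 1.0
    return 1.0


def toy_pay(plan: Plan, action: str) -> float:
    # Stand-in for a payoff model: wandering does not help reach the goal.
    return 0.1 if action == "wander" else 1.0


def toy_done(plan: Plan) -> bool:
    return bool(plan) and plan[-1] == "open door"


# What the Say score alone would pick as the first action.
say_only_first = max(ACTIONS, key=lambda a: toy_say([], a))

plan = beam_search_plan(toy_candidates, toy_say, toy_can, toy_pay, toy_done)
print(say_only_first)  # open door
print(plan)            # ['pick up key', 'open door']
```

In this toy setting, ranking by the Say score alone would choose "open door" first, which is infeasible without the key. Weighting by the Can and Pay scores steers the search to the shorter feasible plan.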