Planning under resource constraints is central to real-world decision making, yet most large language model (LLM) planners assume uniform action costs. We systematically analyze whether tree-search LLM planners are cost-aware and whether they efficiently generate budget-feasible plans. Unlike black-box prompting, and unlike plain LLMs that internalize planning dynamics through post-training trajectories, explicit search trees expose intermediate decisions, node evaluations, and failure modes, which allows controlled ablations of planner behavior. We study depth-first search, breadth-first search, Monte Carlo Tree Search (MCTS), and bidirectional search within a unified framework. Our experiments show that existing tree-based LLM planners often struggle to find cost-optimal plans, and that additional search computation does not reliably improve optimality. Among the methods evaluated, bidirectional search achieves the best overall efficiency and success rate, while MCTS achieves the highest optimality on short-horizon tasks. Our findings suggest that improving LLM planning under resource constraints will likely require new search algorithms, rather than solely scaling inference-time compute.