Large language models (LLMs) are increasingly used for complex multi-step planning tasks, where the tool retrieval (TR) step is crucial to a successful outcome. Two prevalent approaches to TR are single-step retrieval, which uses the complete query, and sequential retrieval using task decomposition (TD), where the full query is segmented into discrete atomic subtasks. Single-step retrieval lacks the flexibility to handle "inter-tool dependency," while the TD approach requires maintaining "subtask-tool atomicity alignment," as the toolbox can evolve dynamically. To address these limitations, we introduce the Progressive Tool retrieval to Improve Planning (ProTIP) framework. ProTIP is a lightweight, contrastive learning-based framework that performs TD implicitly, without requiring explicit subtask labels, while maintaining subtask-tool atomicity. On the ToolBench dataset, ProTIP outperforms the ChatGPT task decomposition-based approach, achieving a 24% improvement in Recall@K=10 for TR and a 41% improvement in tool accuracy for plan generation.
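The Recall@K=10 figure reported for tool retrieval can be made concrete with a small sketch. It assumes the common definition of Recall@K (the fraction of ground-truth tools that appear among the top K retrieved tools, averaged over queries); the abstract does not spell out its exact formulation, and the function names and tool names below are hypothetical, chosen only for illustration.

```python
from typing import Sequence, Set


def recall_at_k(ranked_tools: Sequence[str], relevant_tools: Set[str], k: int = 10) -> float:
    """Fraction of ground-truth tools found in the top-k retrieved tools.

    Assumed definition: |top_k ∩ relevant| / |relevant|.
    """
    if not relevant_tools:
        raise ValueError("relevant_tools must not be empty")
    top_k = set(ranked_tools[:k])
    return len(top_k & relevant_tools) / len(relevant_tools)


def mean_recall_at_k(
    rankings: Sequence[Sequence[str]],
    ground_truths: Sequence[Set[str]],
    k: int = 10,
) -> float:
    """Average Recall@K over a set of queries."""
    if len(rankings) != len(ground_truths):
        raise ValueError("rankings and ground_truths must have the same length")
    if not rankings:
        raise ValueError("at least one query is required")
    scores = [recall_at_k(r, g, k) for r, g in zip(rankings, ground_truths)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical tool names; a real evaluation would use a retriever's ranked output.
    ranked = ["weather_api", "calendar_api", "maps_api"]
    relevant = {"weather_api", "flight_api"}
    print(recall_at_k(ranked, relevant, k=2))
```

Under this definition, a retriever that misses one of the tools a plan depends on is penalized even if every tool it does return is relevant, which is why the metric is a natural fit for multi-tool queries.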