Tool calling has become increasingly popular for Large Language Models (LLMs). However, for large tool sets, the combined token count of all tool descriptions can exceed the LLM's context window, making it impossible to include every tool. Hence, an external retriever is used to provide LLMs with the most relevant tools for a query. Existing retrieval models rank tools based on the similarity between a user query and a tool description (TD). This leads to suboptimal retrieval, as user requests are often poorly aligned with the language of TDs. To remedy this issue, we propose ToolDreamer, a framework that conditions retriever models to fetch tools based on hypothetical (synthetic) TDs generated by an LLM, i.e., descriptions of tools that the LLM anticipates would be useful for the query. The framework enables a more natural alignment between queries and tools within the language space of TDs. We apply ToolDreamer to the ToolRet dataset and show that our method improves the performance of sparse and dense retrievers, both with and without training, showcasing its flexibility. Through our proposed framework, our aim is to offload a portion of the reasoning burden to the retriever so that the LLM can effectively handle a large collection of tools without inundating its context window.
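The core retrieval idea can be sketched as follows. This is a minimal, illustrative example, not the paper's actual implementation: `generate_hypothetical_td` stands in for an LLM call that imagines a plausible tool description for the query, and `embed` is a toy bag-of-words embedding in place of a real sparse or dense retriever. All function names here are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words vector; a real system would use a sparse
    # (e.g. BM25) or dense retriever encoder instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def generate_hypothetical_td(query):
    # Placeholder for an LLM call that writes a hypothetical tool
    # description for the query, in the style of real TDs.
    return f"tool that can {query}"

def retrieve(query, tool_descriptions, k=2):
    # Rank real TDs against the hypothetical TD rather than the raw
    # query, so matching happens within the language space of TDs.
    hyp = embed(generate_hypothetical_td(query))
    return sorted(tool_descriptions,
                  key=lambda td: cosine(hyp, embed(td)),
                  reverse=True)[:k]

tools = [
    "tool that can convert currency between USD and EUR",
    "tool that can fetch current weather for a city",
    "tool that can send email messages",
]
print(retrieve("fetch current weather for a city", tools, k=1))
```

The key design choice, per the abstract, is that similarity is computed between the hypothetical TD and the candidate TDs, so the retriever compares like with like instead of bridging the gap between user phrasing and tool-description phrasing.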