External tools help large language models (LLMs) succeed at tasks where they would otherwise typically fail. In existing frameworks, LLMs learn tool use either via in-context demonstrations or via full-model fine-tuning on annotated data. As these approaches do not easily scale, a recent trend is to abandon them in favor of lightweight, parameter-efficient tuning paradigms. These methods allow quick alternation between the frozen LLM and its specialised fine-tuned version by switching a handful of additional custom parameters on or off. Hence, we postulate that the generalization ability of the frozen model can be leveraged to improve tool selection. We present Tool selECTion via meta-reasONing (TECTON), a two-phase system that first reasons over a task using a custom fine-tuned LM head and outputs candidate tools. Then, with the custom head disabled, it meta-reasons (i.e., it reasons over the previous reasoning process) to make a final choice. We show that TECTON yields substantial gains, both in-distribution and out-of-distribution, on a range of math reasoning datasets.
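The two-phase mechanism can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the custom fine-tuned head is an additive delta on top of a frozen LM head, and all names (`W_frozen`, `W_custom`, `lm_head`) are illustrative. The point it demonstrates is that the same hidden state is projected through two different heads depending on whether the custom parameters are switched on (phase 1: propose candidate tools) or off (phase 2: meta-reason with the frozen model).

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, vocab = 8, 16

# Frozen LM head weights (never updated) and a small fine-tuned delta:
# the "handful of additional custom parameters" that can be toggled.
W_frozen = rng.normal(size=(hidden, vocab))
W_custom = rng.normal(size=(hidden, vocab)) * 0.01

def lm_head(h, use_custom):
    """Project a hidden state to vocabulary logits.

    use_custom=True  -> custom head active  (phase 1: candidate tools)
    use_custom=False -> frozen head only    (phase 2: meta-reasoning)
    """
    W = W_frozen + W_custom if use_custom else W_frozen
    return h @ W

h = rng.normal(size=(hidden,))
logits_phase1 = lm_head(h, use_custom=True)   # custom parameters on
logits_phase2 = lm_head(h, use_custom=False)  # custom parameters off
```

Because the delta is additive, "disabling" the custom head is a single branch at inference time; no weights are copied or reloaded, which is what makes the rapid alternation between the two phases cheap.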