External tools help large language models (LLMs) succeed at tasks where they would otherwise typically fail. In existing frameworks, LLMs learn tool use either via in-context demonstrations or via full-model fine-tuning on annotated data. As these approaches do not easily scale, a recent trend is to abandon them in favor of lightweight, parameter-efficient tuning paradigms. These methods allow quick alternation between the frozen LLM and its specialised fine-tuned version by switching a handful of additional custom parameters on or off. Hence, we postulate that the generalization ability of the frozen model can be leveraged to improve tool selection. We present Tool selECTion via meta-reasONing (TECTON), a two-phase system that first reasons over a task using a custom fine-tuned LM head and outputs candidate tools. Then, with the custom head disabled, it meta-reasons (i.e., it reasons over the previous reasoning process) to make a final choice. We show that TECTON yields substantial gains, both in-distribution and out-of-distribution, on a range of math reasoning datasets.
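The two-phase mechanism can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the custom fine-tuned head is an additive delta on top of a frozen LM head, and all names (`W_frozen`, `W_custom`, `lm_head`) are illustrative. The point it demonstrates is that the same hidden state is projected through two different heads depending on whether the custom parameters are switched on (phase 1: propose candidate tools) or off (phase 2: meta-reason with the frozen model).

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, vocab = 8, 16

# Frozen LM head weights (never updated) and a small fine-tuned delta:
# the "handful of additional custom parameters" that can be toggled.
W_frozen = rng.normal(size=(hidden, vocab))
W_custom = rng.normal(size=(hidden, vocab)) * 0.01

def lm_head(h, use_custom):
    """Project a hidden state to vocabulary logits.

    use_custom=True  -> custom head active  (phase 1: candidate tools)
    use_custom=False -> frozen head only    (phase 2: meta-reasoning)
    """
    W = W_frozen + W_custom if use_custom else W_frozen
    return h @ W

h = rng.normal(size=(hidden,))
logits_phase1 = lm_head(h, use_custom=True)   # custom parameters on
logits_phase2 = lm_head(h, use_custom=False)  # custom parameters off
```

Because the delta is additive, "disabling" the custom head is a single branch at inference time; no weights are copied or reloaded, which is what makes the rapid alternation between the two phases cheap.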