Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum

Augmenting large language models (LLMs) with external tools has emerged as a promising approach to extending their capabilities. Although some works employ open-source LLMs for the tool learning task, most of them are trained in a controlled environment in which the LLM only learns to execute human-provided tools. However, selecting the proper tool from a large toolset is also a crucial ability if a tool learning model is to be applied in real-world applications. Existing methods usually apply self-instruction directly to train the model, which ignores differences in tool complexity. In this paper, we propose Confucius, a novel tool learning framework that trains an LLM to use complicated tools in real-world scenarios. It contains two main phases: (1) we first propose a multi-stage learning method that teaches the LLM to use various tools following an easy-to-difficult curriculum; (2) we then propose Iterative Self-instruct from Introspective Feedback (ISIF), which dynamically constructs the dataset to improve the model's ability to use complicated tools. Extensive experiments conducted in both controlled and real-world settings demonstrate the superiority of our tool learning framework in real-world application scenarios compared to both tuning-free baselines (e.g., ChatGPT, Claude) and tuning-based baselines (e.g., GPT4Tools).
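The two phases can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how difficulty is measured or how feedback is collected, so the `difficulty` score, the `ToolExample` type, the per-tool error rates, and both function names are hypothetical and chosen only for illustration. The first function orders training examples into easy-to-difficult stages; the second picks the tools the model still handles poorly, which an iterative data-construction loop could then target with new examples.

```python
from dataclasses import dataclass


@dataclass
class ToolExample:
    tool: str
    difficulty: float  # hypothetical score; higher means harder


def build_curriculum(examples, num_stages):
    """Split examples into stages ordered from easiest to hardest."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    ordered = sorted(examples, key=lambda ex: ex.difficulty)
    # Ceiling division so every example lands in some stage.
    stage_size = max(1, -(-len(ordered) // num_stages))
    return [ordered[i:i + stage_size] for i in range(0, len(ordered), stage_size)]


def tools_needing_more_data(error_rates, threshold):
    """Return tools whose observed error rate exceeds the threshold.

    error_rates is an assumed stand-in for introspection feedback:
    a mapping from tool name to the fraction of failed tool calls.
    """
    return sorted(tool for tool, rate in error_rates.items() if rate > threshold)
```

In such a loop, each round would train on the curriculum stages in order, measure per-tool error rates, and generate additional examples only for the tools returned by `tools_needing_more_data`.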

