Augmenting large language models (LLMs) with external tools has emerged as a promising way to extend their capabilities. Although some works apply open-source LLMs to the tool learning task, most train them in a controlled environment in which the LLM only learns to execute human-provided tools. However, selecting the proper tool from a large toolset is also a crucial ability if a tool learning model is to be used in real-world applications. Existing methods usually train the model directly with self-instruct methods, which ignore differences in tool complexity. In this paper, we propose Confucius, a novel tool learning framework that trains an LLM to use complicated tools in real-world scenarios. It consists of two main phases: (1) a multi-stage learning method that teaches the LLM to use various tools through an easy-to-difficult curriculum; and (2) Iterative Self-instruct from Introspective Feedback (ISIF), which dynamically constructs the dataset to improve the model's ability to use complicated tools. Extensive experiments in both controlled and real-world settings demonstrate the superiority of our tool learning framework in real-world application scenarios over both tuning-free baselines (e.g., ChatGPT, Claude) and tuning-based baselines (e.g., GPT4Tools).
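The two phases named in the abstract can be illustrated with a minimal sketch. The abstract does not say how tool difficulty is measured or how introspective feedback is computed, so everything below is an assumption made for illustration: the `difficulty` field, the per-tool scores in `[0, 1]`, and the function names `curriculum_stages` and `allocate_new_examples` are hypothetical and are not the authors' implementation.

```python
def curriculum_stages(examples, num_stages=3):
    """Split examples into stages ordered from easy to difficult.

    Assumption: each example is a dict with a numeric 'difficulty' field.
    """
    ordered = sorted(examples, key=lambda ex: ex["difficulty"])
    if not ordered:
        return []
    size = -(-len(ordered) // num_stages)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def allocate_new_examples(tool_scores, budget):
    """Give tools with lower scores a larger share of the generation budget.

    Assumption: tool_scores maps a tool name to a score in [0, 1], where a
    low score means the model currently uses that tool poorly.
    """
    weights = {tool: 1.0 - score for tool, score in tool_scores.items()}
    total = sum(weights.values())
    if total == 0:
        return {tool: 0 for tool in tool_scores}
    return {tool: round(budget * w / total) for tool, w in weights.items()}


# Phase 1 (sketch): train stage by stage, easiest examples first.
examples = [{"tool": "t", "difficulty": d} for d in (3, 1, 2, 5, 4, 6)]
stages = curriculum_stages(examples, num_stages=3)

# Phase 2 (sketch): spend more of the next round's data budget on weak tools.
allocation = allocate_new_examples(
    {"calculator": 0.9, "search": 0.5, "sql": 0.1}, budget=100
)
```

In an iterative setup, the allocation step would be repeated after each training round, with the scores refreshed from the model's latest behavior, so that the dataset keeps shifting toward the tools the model still handles poorly.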