Tool Learning with Foundation Models

Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, Zhenning Dai, Lan Yan, Xin Cong, Yaxi Lu, Weilin Zhao, Yuxiang Huang, Junxi Yan, Xu Han, Xian Sun, Dahai Li, Jason Phang, Cheng Yang, Tongshuang Wu, Heng Ji, Zhiyuan Liu, Maosong Sun

Humans possess an extraordinary ability to create and utilize tools, allowing them to overcome physical limitations and explore new frontiers. With the advent of foundation models, AI systems have the potential to become as adept at tool use as humans. This paradigm, i.e., tool learning with foundation models, combines the strengths of specialized tools and foundation models to achieve enhanced accuracy, efficiency, and automation in problem-solving. Despite its immense potential, the field still lacks a comprehensive understanding of its key challenges, opportunities, and future directions. To this end, we present a systematic investigation of tool learning in this paper. We first introduce the background of tool learning, including its cognitive origins, the paradigm shift of foundation models, and the complementary roles of tools and models. We then categorize existing tool learning research into tool-augmented and tool-oriented learning. We formulate a general tool learning framework: starting from understanding the user instruction, models should learn to decompose a complex task into several subtasks, dynamically adjust their plan through reasoning, and effectively conquer each subtask by selecting appropriate tools. We also discuss how to train models for improved tool-use capabilities and how to facilitate generalization in tool learning. Considering the lack of a systematic tool learning evaluation in prior work, we experiment with 18 representative tools and show the potential of current foundation models in skillfully utilizing tools. Finally, we discuss several open problems that require further investigation for tool learning. Overall, we hope this paper will inspire future research on integrating tools with foundation models.
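
The framework described in the abstract (decompose the instruction into subtasks, select a tool for each, and adjust the plan as execution proceeds) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the `Tool` class, the keyword-based `select_tool`, the `" then "` splitting rule in `decompose`, and the two toy tools are all hypothetical stand-ins for steps that a foundation model would perform through reasoning.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Tool:
    """Hypothetical tool record: a name, trigger keywords, and a callable."""
    name: str
    keywords: List[str]
    run: Callable[[str], str]


def decompose(instruction: str) -> List[str]:
    """Toy planner: split the instruction on ' then ' into ordered subtasks.
    A foundation model would do this step through reasoning."""
    return [part.strip() for part in instruction.split(" then ") if part.strip()]


def select_tool(subtask: str, tools: List[Tool]) -> Tool:
    """Toy selector: return the first tool whose keyword occurs in the subtask."""
    lowered = subtask.lower()
    for tool in tools:
        if any(keyword in lowered for keyword in tool.keywords):
            return tool
    raise LookupError(f"no tool matches subtask: {subtask!r}")


def solve(instruction: str, tools: List[Tool]) -> List[Tuple[str, str, str]]:
    """Run each subtask with its selected tool.
    Plan adjustment is reduced here to skipping subtasks that no tool covers."""
    plan = decompose(instruction)
    results = []
    while plan:
        subtask = plan.pop(0)
        try:
            tool = select_tool(subtask, tools)
        except LookupError:
            results.append((subtask, "none", "skipped"))
            continue
        results.append((subtask, tool.name, tool.run(subtask)))
    return results


# Two illustrative tools (assumptions, not among the paper's 18 tools).
TOOLS = [
    Tool("adder", ["add"],
         lambda text: str(sum(int(n) for n in re.findall(r"\d+", text)))),
    Tool("shouter", ["shout"], lambda text: text.upper()),
]

if __name__ == "__main__":
    for step in solve("add 2 and 3 then shout hello", TOOLS):
        print(step)
```

In a real system the planner and selector would be model calls, and a failed or unhelpful tool result would feed back into the plan rather than simply being skipped.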

