Recently, tool learning with large language models (LLMs) has emerged as a promising paradigm for augmenting the capabilities of LLMs to tackle highly complex problems. Despite growing attention and rapid progress in this field, the existing literature remains fragmented and lacks systematic organization, which raises the barrier to entry for newcomers. This gap motivates us to conduct a comprehensive survey of existing work on tool learning with LLMs. In this survey, we review the literature from two primary aspects: (1) why tool learning is beneficial and (2) how tool learning is implemented. We first explore the "why" by reviewing, across six specific aspects, both the benefits of tool integration and the inherent benefits of the tool learning paradigm. For the "how", we systematically review the literature according to a taxonomy of four key stages in the tool learning workflow: task planning, tool selection, tool calling, and response generation. Additionally, we provide a detailed summary of existing benchmarks and evaluation methods, categorized by the stages to which they are relevant. Finally, we discuss current challenges and outline potential future directions, aiming to inspire both researchers and industrial developers to further explore this emerging area. We also maintain a GitHub repository that tracks relevant papers and resources at https://github.com/quchangle1/LLM-Tool-Survey.
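As a rough illustration of the four-stage workflow named above (task planning, tool selection, tool calling, response generation), the following minimal sketch wires the stages together with toy rules. Everything in it is an assumption made for illustration: the `TOOLS` registry, the function names, and the rule-based logic are hypothetical stand-ins for steps that an LLM would typically perform in a real system, and none of it is taken from the survey itself.

```python
from typing import Callable, Dict, List

# Hypothetical toy tool registry; names and behaviors are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    # Adds integers separated by '+', e.g. "2+3" -> "5".
    "calculator": lambda expr: str(sum(int(x) for x in expr.split("+"))),
    # Returns its input unchanged.
    "echo": lambda text: text,
}


def plan_tasks(query: str) -> List[str]:
    """Stage 1, task planning: split the query into sub-tasks (here, on ';')."""
    return [part.strip() for part in query.split(";") if part.strip()]


def select_tool(subtask: str) -> str:
    """Stage 2, tool selection: choose a tool name with a trivial rule."""
    return "calculator" if "+" in subtask else "echo"


def call_tool(tool_name: str, subtask: str) -> str:
    """Stage 3, tool calling: invoke the chosen tool on the sub-task text."""
    return TOOLS[tool_name](subtask)


def generate_response(results: List[str]) -> str:
    """Stage 4, response generation: merge tool outputs into one answer."""
    return " | ".join(results)


def run_pipeline(query: str) -> str:
    """Run the four stages in order for a single user query."""
    results = [call_tool(select_tool(task), task) for task in plan_tasks(query)]
    return generate_response(results)


if __name__ == "__main__":
    print(run_pipeline("2+3; hello"))
```

Keeping each stage as its own function mirrors the taxonomy: any one stage could be swapped for an LLM-backed implementation without changing the others.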