Researchers and practitioners have recently reframed powerful Large Language Models (LLMs) as agents, enabling them to automate complex tasks largely through the use of specialized functions. To facilitate the development of LLM agents, we present a novel paradigm of training LLM agents without modifying the LLM weights, which is particularly useful when the LLMs are inaccessible or difficult to modify. Inspired by how humans continuously forge tools to adapt to real-world tasks, rather than changing our biological structure to fit a static set of tools, we propose to progressively forge the agent's functions to better solve downstream tasks instead of modifying the LLM weights. By treating the functions as learnable `agent parameters' and leveraging the fundamental idea of model training in artificial intelligence, we develop AgentOptimizer, which employs the LLM to update agents' functions, and devise an agent training algorithm with two strategies, roll-back and early-stop, to streamline the training process. With extensive experiments, we show that the agent training paradigm significantly improves the performance of representative LLM agents on various downstream tasks. We also study the behavior of agent training with respect to aspects such as the learning curve and domain transferability.
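The training loop described above can be sketched as follows. This is a minimal, hypothetical illustration of treating functions as learnable "agent parameters" with roll-back and early-stop; the names `propose_functions` and `evaluate` are stand-ins (not the paper's actual API), and both are stubbed here so the loop is runnable.

```python
# Hypothetical sketch of the agent training loop: an LLM-backed optimizer
# proposes function updates, which are kept only if they improve performance
# (roll-back) and training halts after repeated non-improvements (early-stop).
import random

def propose_functions(current):
    # Stand-in: a real optimizer (e.g. AgentOptimizer) would prompt an LLM
    # with the execution history and ask it to add/revise/remove functions.
    return current + [f"tool_{len(current)}"]

def evaluate(functions, tasks):
    # Stand-in success rate; a real evaluation runs the agent on the tasks.
    return min(1.0, 0.2 + 0.1 * len(functions) + random.uniform(-0.05, 0.05))

def train(tasks, epochs=10, patience=3):
    functions, best_score, stall = [], 0.0, 0
    for _ in range(epochs):
        candidate = propose_functions(functions)
        score = evaluate(candidate, tasks)
        if score > best_score:
            functions, best_score, stall = candidate, score, 0  # keep update
        else:
            stall += 1  # roll-back: discard the proposed update
        if stall >= patience:
            break  # early-stop: no recent progress
    return functions, best_score
```

The key design choice mirrored here is that only the function set changes between epochs; the LLM weights are never touched, which is why the paradigm applies even to closed, API-only models.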