The AI community has been exploring a pathway to artificial general intelligence (AGI) by developing "language agents": complex large language model (LLM) pipelines that combine prompting techniques with tool usage methods. While language agents have demonstrated impressive capabilities on many real-world tasks, a fundamental limitation of current language agent research is that it is model-centric, or engineering-centric. That is, progress on the prompts, tools, and pipelines of language agents requires substantial manual engineering effort from human experts rather than being learned automatically from data. We believe that the transition from model-centric, or engineering-centric, to data-centric development, i.e., the ability of language agents to autonomously learn and evolve in environments, is key to their possibly achieving AGI. In this work, we introduce agent symbolic learning, a systematic framework that enables language agents to optimize themselves in a data-centric way using symbolic optimizers. Specifically, we consider agents as symbolic networks whose learnable weights are defined by prompts, tools, and the way they are stacked together. Agent symbolic learning is designed to optimize the symbolic network within language agents by mimicking two fundamental algorithms in connectionist learning: back-propagation and gradient descent. Instead of dealing with numeric weights, agent symbolic learning works with natural language simulacra of weights, loss, and gradients. We conduct proof-of-concept experiments on both standard benchmarks and complex real-world tasks and show that agent symbolic learning enables language agents to update themselves after being created and deployed in the wild, resulting in "self-evolving agents".
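To make the analogy with back-propagation and gradient descent concrete, the following is a minimal sketch of one symbolic training step. It is not the authors' implementation. It assumes a linear pipeline in which only prompts are learnable (the abstract also counts tools and the pipeline structure as weights), and it treats the LLM as a hypothetical text-in, text-out callable. All names (`Node`, `Step`, `forward`, `language_loss`, `backward`, `update`, `train_step`, `make_stub_llm`) and all prompt wordings are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical LLM interface (an assumption): prompt text in, completion text out.
LLM = Callable[[str], str]


@dataclass
class Node:
    """One layer of the symbolic network; its prompt plays the role of a weight."""
    name: str
    prompt: str


@dataclass
class Step:
    """Record of one node's execution, kept so feedback can flow backward."""
    node: Node
    node_input: str
    node_output: str
    gradient: str = ""  # natural language "gradient", filled in by backward()


def forward(nodes: List[Node], task: str, llm: LLM) -> List[Step]:
    """Run the pipeline, feeding each node's output to the next node."""
    trajectory, current = [], task
    for node in nodes:
        output = llm(f"{node.prompt}\n\nInput:\n{current}")
        trajectory.append(Step(node, current, output))
        current = output
    return trajectory


def language_loss(trajectory: List[Step], task: str, llm: LLM) -> str:
    """Textual loss: a critique of the final output with respect to the task."""
    final = trajectory[-1].node_output
    return llm(f"Task:\n{task}\n\nFinal output:\n{final}\n\nCritique the output.")


def backward(trajectory: List[Step], loss: str, llm: LLM) -> None:
    """Propagate textual feedback from the last node back to the first."""
    downstream = loss
    for step in reversed(trajectory):
        step.gradient = llm(
            f"Downstream feedback:\n{downstream}\n\n"
            f"Node prompt:\n{step.node.prompt}\n\n"
            f"Node input:\n{step.node_input}\n\nNode output:\n{step.node_output}\n\n"
            "Explain how this node contributed and what it should change."
        )
        downstream = step.gradient


def update(trajectory: List[Step], llm: LLM) -> None:
    """Symbolic analogue of a gradient step: rewrite each prompt using its feedback."""
    for step in trajectory:
        step.node.prompt = llm(
            f"Current prompt:\n{step.node.prompt}\n\n"
            f"Feedback:\n{step.gradient}\n\nReturn an improved prompt."
        )


def train_step(nodes: List[Node], task: str, llm: LLM) -> Tuple[List[Step], str]:
    """Forward pass, language loss, backward pass, and prompt update."""
    trajectory = forward(nodes, task, llm)
    loss = language_loss(trajectory, task, llm)
    backward(trajectory, loss, llm)
    update(trajectory, llm)
    return trajectory, loss


def make_stub_llm() -> Tuple[LLM, List[str]]:
    """Deterministic stand-in for a real model, so the sketch runs offline."""
    calls: List[str] = []

    def llm(prompt: str) -> str:
        calls.append(prompt)
        return f"reply-{len(calls)}"

    return llm, calls
```

To try the sketch, build a list such as `[Node("plan", "Plan the answer."), Node("write", "Write the answer.")]`, obtain a callable from `make_stub_llm()` or wrap a real model client in the same signature, and call `train_step` once per training example. Keeping the trajectory separate from the nodes mirrors how activations are cached for back-propagation in numeric networks.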