The Rise and Potential of Large Language Model Based Agents: A Survey

Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, Rui Zheng, Xiaoran Fan, Xiao Wang, Limao Xiong, Qin Liu, Yuhao Zhou, Weiran Wang, Changhao Jiang, Yicheng Zou, Xiangyang Liu, Zhangyue Yin, Shihan Dou, Rongxiang Weng, Wensen Cheng, Qi Zhang, Wenjuan Qin, Yongyan Zheng, Xipeng Qiu, Xuanjing Huang, Tao Gui

From arXiv, 86 pages, 12 figures

For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are artificial entities that sense their environment, make decisions, and take actions. Many efforts have been made to develop intelligent agents since the mid-20th century. However, these efforts have mainly focused on advances in algorithms or training strategies to enhance specific capabilities or performance on particular tasks. What the community has lacked is a sufficiently general and powerful model to serve as a starting point for designing AI agents that can adapt to diverse scenarios. Because of the versatile capabilities they demonstrate, large language models (LLMs) are regarded as potential sparks of Artificial General Intelligence (AGI), offering hope for building general AI agents. Many researchers have used LLMs as the foundation for AI agents and have achieved significant progress. In this survey, we start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for AI agents. Building on this, we present a conceptual framework for LLM-based agents comprising three main components: brain, perception, and action. The framework can be tailored to suit different applications. We then explore the extensive applications of LLM-based agents in three settings: single-agent scenarios, multi-agent scenarios, and human-agent cooperation. Next, we examine agent societies, exploring the behavior and personality of LLM-based agents, the social phenomena that emerge when they form societies, and the insights they offer for human society. Finally, we discuss key topics and open problems in the field.
