Recent efforts have augmented large language models (LLMs) with external resources (e.g., the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or reasoning, leading to a new class of language agents. While these agents have achieved substantial empirical success, we lack a systematic framework to organize existing agents and plan future developments. In this paper, we draw on the rich history of cognitive science and symbolic artificial intelligence to propose Cognitive Architectures for Language Agents (CoALA). CoALA describes a language agent with modular memory components, a structured action space to interact with internal memory and external environments, and a generalized decision-making process to choose actions. We use CoALA to retrospectively survey and organize a large body of recent work, and prospectively identify actionable directions towards more capable agents. Taken together, CoALA contextualizes today's language agents within the broader history of AI and outlines a path towards language-based general intelligence.
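To make the three ingredients named in the abstract concrete (modular memory, a structured action space over internal memory and the external environment, and a decision-making process that chooses actions), the following is a minimal illustrative sketch. It is not the paper's implementation. Every name in it (`Memory`, `Action`, `decide`, `run_agent`, the split into working and long-term memory, and the rule-based stand-in for an LLM call) is a hypothetical assumption chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Memory:
    """Modular memory; the working / long-term split is an assumption of this sketch."""
    working: Dict[str, str] = field(default_factory=dict)
    long_term: List[str] = field(default_factory=list)


@dataclass
class Action:
    """One entry in the action space, tagged as internal or external."""
    name: str
    kind: str  # "internal" acts on memory, "external" acts on the environment
    run: Callable[[Memory, str], str]


def retrieve(memory: Memory, observation: str) -> str:
    # Internal action: copy matching long-term entries into working memory.
    hits = [item for item in memory.long_term if observation in item]
    memory.working["retrieved"] = "; ".join(hits)
    return memory.working["retrieved"]


def respond(memory: Memory, observation: str) -> str:
    # External action: emit output to the environment (here, just a string).
    return f"reply({observation}|{memory.working.get('retrieved', '')})"


ACTIONS: Dict[str, Action] = {
    "retrieve": Action("retrieve", "internal", retrieve),
    "respond": Action("respond", "external", respond),
}


def decide(memory: Memory, actions: Dict[str, Action]) -> Action:
    # Hypothetical stand-in for an LLM-driven choice: retrieve first, then respond.
    if "retrieved" not in memory.working:
        return actions["retrieve"]
    return actions["respond"]


def run_agent(
    observation: str,
    memory: Memory,
    actions: Dict[str, Action],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[str]]:
    """Repeat the decision step until an external action is taken."""
    trace: List[str] = []
    for _ in range(max_steps):
        action = decide(memory, actions)
        result = action.run(memory, observation)
        trace.append(action.name)
        if action.kind == "external":
            return result, trace
    return None, trace
```

Calling `run_agent("tea", Memory(long_term=["tea is hot"]), ACTIONS)` performs one internal retrieval step followed by one external response. In a real language agent the `decide` function would be driven by an LLM rather than a fixed rule.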