Recent efforts have augmented large language models (LLMs) with external resources (e.g., the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or reasoning, leading to a new class of language agents. While these agents have achieved substantial empirical success, we lack a systematic framework to organize existing agents and plan future developments. In this paper, we draw on the rich history of cognitive science and symbolic artificial intelligence to propose Cognitive Architectures for Language Agents (CoALA). CoALA describes a language agent with modular memory components, a structured action space to interact with internal memory and external environments, and a generalized decision-making process to choose actions. We use CoALA to retrospectively survey and organize a large body of recent work, and prospectively identify actionable directions towards more capable agents. Taken together, CoALA contextualizes today's language agents within the broader history of AI and outlines a path towards language-based general intelligence.
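As a rough illustration of the three ingredients named above (modular memory, a structured action space split into internal and external actions, and a decision-making process that chooses among them), a minimal sketch might look as follows. It is not the CoALA reference implementation. The names `Memory`, `Action`, `decide`, and `run_agent`, the module names `"task"` and `"notes"`, and the hard-coded decision rule are all hypothetical stand-ins. In a real language agent, an LLM call would presumably take the place of `decide`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Memory:
    """Modular memory: each named module is an independent list of text entries."""
    modules: Dict[str, List[str]] = field(default_factory=dict)

    def write(self, module: str, item: str) -> None:
        self.modules.setdefault(module, []).append(item)

    def read(self, module: str) -> List[str]:
        return list(self.modules.get(module, []))


@dataclass
class Action:
    """One entry in the structured action space."""
    name: str
    kind: str  # "internal" (touches memory) or "external" (touches the environment)
    run: Callable[[Memory, List[str]], str]


def decide(memory: Memory, actions: List[Action]) -> Action:
    """Toy decision rule (an assumption): take notes first, then act externally."""
    wanted = "external" if memory.read("notes") else "internal"
    return next(a for a in actions if a.kind == wanted)


def run_agent(task: str, steps: int = 2):
    memory = Memory()
    environment: List[str] = []  # stand-in for an external environment
    memory.write("task", task)

    def take_note(mem: Memory, env: List[str]) -> str:
        # Internal action: read one memory module, write to another.
        note = f"plan for: {mem.read('task')[-1]}"
        mem.write("notes", note)
        return note

    def emit(mem: Memory, env: List[str]) -> str:
        # External action: push the latest note into the environment.
        env.append(mem.read("notes")[-1])
        return env[-1]

    actions = [
        Action("take_note", "internal", take_note),
        Action("emit", "external", emit),
    ]

    trace = []
    for _ in range(steps):
        action = decide(memory, actions)
        action.run(memory, environment)
        trace.append(action.name)
    return trace, memory, environment
```

Calling `run_agent("book a flight")` runs one internal action followed by one external action. The point of the sketch is only the separation of concerns: memory modules are addressed by name, every action declares whether it touches memory or the environment, and the choice of action is isolated in a single replaceable function.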