Memory has emerged as a core module in large language model (LLM)-based agents for long-horizon complex tasks (e.g., multi-turn dialogue, game playing, scientific discovery), where it enables knowledge accumulation, iterative reasoning, and self-evolution. Many memory methods have been proposed in the literature, but they have not been systematically and comprehensively compared under the same experimental settings. In this paper, we first present a unified framework that covers all existing agent memory methods from a high-level perspective. We then extensively compare representative agent memory methods on two well-known benchmarks, examine the effectiveness of each, and provide a thorough analysis. As a byproduct of our experimental analysis, we also design a new memory method by combining modules from existing methods, which outperforms the state-of-the-art methods. Finally, based on these findings, we outline promising future research opportunities. We believe that a deeper understanding of the behavior of existing methods can provide valuable new insights for future research.