Agents based on Large Language Models (LLMs) excel at diverse tasks, yet their procedural memory is brittle: it is either manually engineered or entangled in static parameters. In this work, we investigate strategies to endow agents with learnable, updatable, lifelong procedural memory. We propose Memp, which distills past agent trajectories into both fine-grained, step-by-step instructions and higher-level, script-like abstractions, and we explore the impact of different strategies for the Build, Retrieval, and Update of procedural memory. Coupled with a dynamic regimen that continuously updates, corrects, and deprecates its contents, the resulting repository evolves in lockstep with new experience. Empirical evaluation on TravelPlanner and ALFWorld shows that, as the memory repository is refined, agents achieve steadily higher success rates and greater efficiency on analogous tasks. Moreover, procedural memory built with a stronger model retains its value: migrating it to a weaker model also yields substantial performance gains. Code is available at https://github.com/zjunlp/MemP.
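The Build / Retrieval / Update lifecycle described above can be sketched as follows. This is a minimal illustrative mock, not the paper's actual implementation: all class and method names are hypothetical, distillation is reduced to enumerating trajectory steps, and retrieval uses simple string similarity in place of a learned retriever.

```python
# Hypothetical sketch of a Build / Retrieve / Update loop for procedural
# memory, in the spirit of Memp. Names and scoring are illustrative only.
from difflib import SequenceMatcher

class ProceduralMemory:
    def __init__(self):
        # Each entry stores a task description, a distilled procedure,
        # and a utility score used for deprecation.
        self.entries = []

    def build(self, task, trajectory, success):
        # Build: distill a past trajectory into step-by-step instructions.
        procedure = [f"step {i + 1}: {action}"
                     for i, action in enumerate(trajectory)]
        self.entries.append({"task": task,
                             "procedure": procedure,
                             "score": 1.0 if success else 0.0})

    def retrieve(self, task):
        # Retrieve: return the stored procedure whose task description
        # is most similar to the new task (string similarity stands in
        # for a real retriever here).
        if not self.entries:
            return None
        return max(self.entries,
                   key=lambda e: SequenceMatcher(None, e["task"], task).ratio())

    def update(self, entry, success):
        # Update: reinforce procedures that keep working and deprecate
        # those that repeatedly fail on analogous tasks.
        entry["score"] += 1.0 if success else -1.0
        self.entries = [e for e in self.entries if e["score"] > -1.0]
```

Under this sketch, a successful trajectory is stored once, reused on similar tasks, and silently dropped from the repository after repeated failures, so the memory evolves with new experience rather than remaining static.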