Most Large Language Model (LLM) agent memory systems rely on a small set of static, hand-designed operations for extracting memory. These fixed procedures hard-code human priors about what to store and how to revise memory, making them rigid under diverse interaction patterns and inefficient on long histories. To address this, we present \textbf{MemSkill}, which reframes these operations as learnable and evolvable \emph{memory skills}: structured, reusable routines for extracting, consolidating, and pruning information from interaction traces. Inspired by the design philosophy of agent skills, MemSkill employs a \emph{controller} that learns to select a small set of relevant skills, paired with an LLM-based \emph{executor} that produces skill-guided memories. Beyond learning skill selection, MemSkill introduces a \emph{designer} that periodically reviews hard cases in which the selected skills yield incorrect or incomplete memories, and evolves the skill set by proposing refinements and new skills. Together, these components form a closed loop that improves both the skill-selection policy and the skill set itself. Experiments on LoCoMo, LongMemEval, HotpotQA, and ALFWorld show that MemSkill outperforms strong baselines and generalizes well across settings. Further analyses shed light on how skills evolve, offering insights toward more adaptive, self-evolving memory management for LLM agents.
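The controller/executor/designer loop described above can be sketched minimally. Everything in this sketch is a hypothetical stand-in: the keyword-overlap scoring replaces the learned selection policy, and the string-tagging executor and per-case skill creation replace LLM calls; none of it is the paper's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Skill:
    """A reusable memory routine: a name plus a natural-language instruction."""
    name: str
    instruction: str

def controller_select(skills, trace, k=2):
    # Stand-in for the learned controller: score each skill by word overlap
    # between its instruction and the interaction trace, keep the top k.
    words = set(trace.lower().split())
    scored = sorted(
        skills,
        key=lambda s: -len(words & set(s.instruction.lower().split())),
    )
    return scored[:k]

def executor_apply(selected, trace):
    # Stand-in for the LLM-based executor: emit one skill-guided memory
    # per selected skill.
    return [f"[{s.name}] memory extracted from: {trace}" for s in selected]

def designer_evolve(skills, hard_cases):
    # Stand-in for the designer: for each hard case, propose one new skill
    # (a real designer would also refine existing skills).
    for case in hard_cases:
        skills.append(Skill(name=f"skill_{len(skills)}", instruction=f"handle: {case}"))
    return skills

# One pass of the closed loop.
skills = [
    Skill("extract_prefs", "extract user preferences from dialogue"),
    Skill("prune_stale", "prune stale or contradicted facts"),
]
selected = controller_select(skills, "the user states preferences in the dialogue", k=1)
memories = executor_apply(selected, "the user states preferences in the dialogue")
skills = designer_evolve(skills, hard_cases=["selected skill missed a date entity"])
```

The point of the sketch is the control flow, not the components: selection, execution, and periodic skill evolution feed back into the next round's skill set.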