Despite their remarkable capabilities, Large Language Models (LLMs) struggle to effectively leverage historical interaction information in dynamic and complex environments. Memory systems enable LLMs to move beyond stateless interactions by introducing persistent information storage, retrieval, and utilization mechanisms. However, existing memory systems often introduce substantial time and computational overhead. To address this, we introduce a new memory system called LightMem, which balances performance and efficiency. Inspired by the Atkinson-Shiffrin model of human memory, LightMem organizes memory into three complementary stages. First, cognition-inspired sensory memory rapidly filters irrelevant information through lightweight compression and groups information by topic. Next, topic-aware short-term memory consolidates these topic-based groups, organizing and summarizing content for more structured access. Finally, long-term memory with sleep-time update employs an offline procedure that decouples consolidation from online inference. Experiments on LongMemEval with GPT and Qwen backbones show that LightMem outperforms strong baselines in accuracy (up to 10.9% gains) while reducing token usage by up to 117x, API calls by up to 159x, and runtime by over 12x. The code is available at https://github.com/zjunlp/LightMem.
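The three-stage pipeline described above can be sketched as follows. This is a minimal illustrative sketch, not the actual LightMem implementation; all class names, method names, and the word-count filtering heuristic are hypothetical stand-ins for the paper's lightweight compression, topic-aware summarization, and sleep-time update mechanisms.

```python
# Hypothetical sketch of a three-stage memory pipeline in the spirit of
# LightMem; names and heuristics are illustrative, not the real system.
from collections import defaultdict


class SensoryMemory:
    """Stage 1: lightweight filtering and topic grouping (hypothetical)."""

    def __init__(self, min_words=5):
        self.min_words = min_words

    def process(self, utterances):
        # Drop very short turns as a crude stand-in for compression-based
        # filtering of irrelevant information, then group by topic.
        kept = [u for u in utterances if len(u["text"].split()) >= self.min_words]
        groups = defaultdict(list)
        for u in kept:
            groups[u["topic"]].append(u["text"])
        return groups


class ShortTermMemory:
    """Stage 2: consolidate topic groups into structured summaries."""

    def consolidate(self, groups):
        # A real system would summarize with an LLM; here we just join turns.
        return {topic: " | ".join(texts) for topic, texts in groups.items()}


class LongTermMemory:
    """Stage 3: sleep-time update, decoupled from online inference."""

    def __init__(self):
        self.store = {}
        self.pending = []

    def stage(self, summaries):
        # Online path: only queue updates, never block inference.
        self.pending.append(summaries)

    def sleep_update(self):
        # Offline consolidation pass, run outside the inference path.
        for summaries in self.pending:
            self.store.update(summaries)
        self.pending.clear()


utterances = [
    {"topic": "travel", "text": "I am planning a trip to Kyoto next spring"},
    {"topic": "travel", "text": "ok"},  # filtered out: too short
    {"topic": "work", "text": "My deadline for the report moved to Friday"},
]
groups = SensoryMemory().process(utterances)
summaries = ShortTermMemory().consolidate(groups)
ltm = LongTermMemory()
ltm.stage(summaries)   # cheap online step
ltm.sleep_update()     # deferred offline consolidation
```

The design choice illustrated here is the decoupling in stage 3: consolidation work is queued during interaction and applied in a separate offline pass, which is how the abstract frames the source of LightMem's runtime and API-call savings.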