To support long-term interaction in complex environments, LLM agents require memory systems that manage historical experiences. Existing approaches either retain full interaction histories via passive context extension, leading to substantial redundancy, or rely on iterative reasoning to filter noise, incurring high token costs. To address this challenge, we introduce SimpleMem, an efficient memory framework based on semantic lossless compression. We propose a three-stage pipeline designed to maximize information density and token utilization: (1) Semantic Structured Compression, which distills unstructured interactions into compact, multi-view indexed memory units; (2) Online Semantic Synthesis, an intra-session process that instantly integrates related context into unified abstract representations to eliminate redundancy; and (3) Intent-Aware Retrieval Planning, which infers search intent to dynamically determine retrieval scope and construct precise context efficiently. Experiments on benchmark datasets show that our method consistently outperforms baseline approaches in accuracy, retrieval efficiency, and inference cost, achieving an average F1 improvement of 26.4% while reducing inference-time token consumption by up to a factor of 30, which demonstrates a superior balance between performance and efficiency. Code is available at https://github.com/aiming-lab/SimpleMem.
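To make the shape of the three-stage pipeline concrete, the following is a minimal toy sketch in Python. It is not the SimpleMem implementation: every name here (`MemoryUnit`, `compress`, `synthesize`, `plan_retrieval`, `retrieve`) is hypothetical, keyword overlap stands in for the semantic indexing and LLM-based steps the abstract describes, and the retrieval depths of 1 and 5 are arbitrary illustrative values.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical filler list; a real system would use a learned or LLM-based filter.
FILLER = {"the", "a", "an", "to", "of", "and", "is", "in", "on", "at", "i",
          "my", "we", "it", "for", "with", "did", "what", "when", "ok", "yes"}
# Hypothetical cue words that suggest a broad, multi-item query.
BROAD_CUES = {"all", "every", "list", "summarize"}


def tokens(text: str) -> list[str]:
    """Lowercased alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keywords(text: str) -> frozenset[str]:
    """Content words only; a crude stand-in for a semantic index view."""
    return frozenset(t for t in tokens(text) if t not in FILLER)


@dataclass
class MemoryUnit:
    text: str
    session: int
    keys: frozenset[str]


def compress(turns: list[tuple[str, str]], session: int) -> list[MemoryUnit]:
    """Stage 1 (toy): drop turns with no content words and index the rest."""
    units = []
    for _speaker, text in turns:
        keys = keywords(text)
        if keys:
            units.append(MemoryUnit(text, session, keys))
    return units


def synthesize(units: list[MemoryUnit], min_overlap: int = 2) -> list[MemoryUnit]:
    """Stage 2 (toy): merge same-session units that share enough keywords."""
    merged: list[MemoryUnit] = []
    for unit in units:
        for i, m in enumerate(merged):
            if m.session == unit.session and len(m.keys & unit.keys) >= min_overlap:
                merged[i] = MemoryUnit(m.text + " " + unit.text, m.session,
                                       m.keys | unit.keys)
                break
        else:
            merged.append(unit)
    return merged


def plan_retrieval(query: str) -> int:
    """Stage 3 (toy): guess the query's scope and pick a retrieval depth k."""
    return 5 if set(tokens(query)) & BROAD_CUES else 1


def retrieve(query: str, memory: list[MemoryUnit]) -> list[MemoryUnit]:
    """Return up to k units ranked by keyword overlap with the query."""
    k = plan_retrieval(query)
    q = keywords(query)
    ranked = sorted(memory, key=lambda m: len(m.keys & q), reverse=True)
    return [m for m in ranked[:k] if m.keys & q]


turns = [("alice", "I adopted a dog named Rex"),
         ("bob", "ok"),
         ("alice", "Rex the dog loves the park")]
memory = synthesize(compress(turns, session=1))
hits = retrieve("What is the dog named?", memory)
```

In this sketch the filler turn is discarded at stage 1, the two turns about the same dog collapse into a single unit at stage 2, and the narrow question triggers a depth-1 lookup at stage 3, so only one compact unit reaches the model's context.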