Human-agent dialogues often exhibit topic continuity, a stable thematic frame that evolves across temporally adjacent exchanges, yet most large language model (LLM) agent memory systems fail to preserve it. Existing designs follow a fragmentation-compensation paradigm: they first break dialogue streams into isolated utterances for storage, then attempt to restore coherence through embedding-based retrieval. This process irreversibly damages narrative and causal flow while biasing retrieval toward lexical similarity. We introduce Membox, a hierarchical memory architecture centered on a Topic Loom that continuously monitors dialogue in a sliding-window fashion, grouping consecutive same-topic turns into coherent "memory boxes" at storage time. Sealed boxes are then linked by a Trace Weaver into long-range event-timeline traces, recovering macro-topic recurrences across discontinuities. Experiments on LoCoMo show that Membox achieves up to a 68% F1 improvement on temporal reasoning tasks, outperforming competitive baselines such as Mem0 and A-MEM. Notably, Membox attains these gains while consuming only a fraction of the context tokens required by existing methods, striking a superior balance between efficiency and effectiveness. By explicitly modeling topic continuity, Membox offers a cognitively motivated mechanism for enhancing both coherence and efficiency in LLM agents.
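The two mechanisms named above can be sketched minimally: a grouper that assigns each incoming turn to the current memory box while it stays on topic (sealing the box and opening a new one otherwise), and a weaver that links sealed boxes whose topics recur later into timeline traces. The abstract does not specify how the Topic Loom detects topic shifts, so this sketch substitutes a toy content-word-overlap heuristic; all names and thresholds here are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only: a keyword-overlap stand-in for the Topic Loom's
# topic test, and a greedy stand-in for the Trace Weaver's trace linking.
from dataclasses import dataclass, field

@dataclass
class MemoryBox:
    topic: set                         # toy topic signature: content words seen so far
    turns: list = field(default_factory=list)

def overlap(a: set, b: set) -> float:
    """Jaccard overlap between two word sets (toy topic-similarity measure)."""
    return len(a & b) / max(1, len(a | b))

def build_boxes(turns, threshold=0.1):
    """Group consecutive same-topic turns into memory boxes at storage time."""
    boxes, current = [], None
    for turn in turns:
        words = {w.lower().strip(".,") for w in turn.split() if len(w) > 3}
        if current is None or overlap(current.topic, words) < threshold:
            current = MemoryBox(topic=set(words))  # seal previous box, open a new one
            boxes.append(current)
        else:
            current.topic |= words                 # topic frame evolves with each turn
        current.turns.append(turn)
    return boxes

def weave_traces(boxes, threshold=0.1):
    """Link sealed boxes whose topics recur into long-range timeline traces."""
    traces = []
    for box in boxes:
        for trace in traces:
            if overlap(trace[-1].topic, box.topic) >= threshold:
                trace.append(box)      # macro-topic recurrence across a discontinuity
                break
        else:
            traces.append([box])       # no matching trace: start a new one
    return traces
```

On a stream where a "puppy" topic is interrupted by an unrelated "budget" turn and then resumes, `build_boxes` seals three boxes, and `weave_traces` reconnects the first and third into one trace, which is the kind of macro-topic recovery the Trace Weaver is described as performing.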