Large language model (LLM) agents are increasingly expected to operate over long-term interactions, in which information from past dialogues must be preserved and recalled to support future tasks. As interactions accumulate, however, the memory store grows without bound and fills with redundant entries, which inflate storage cost and degrade retrieval by crowding out the most useful evidence. This is especially limiting on resource-constrained platforms with hard memory budgets. We therefore formulate storage-budgeted memory management: the task of keeping an already constructed memory store within a fixed budget while preserving information useful for future interactions. To address it, we propose MemRefine, an LLM-guided framework. Because surface similarity poorly reflects factual value, MemRefine uses similarity only to propose candidate pairs and defers the decision to delete, merge, or preserve to an LLM judge that decides based on factual content, iterating until the budget is met. Across multiple memory frameworks and long-term conversation benchmarks, MemRefine consistently meets target budgets while preserving downstream performance, and it outperforms rule-based baselines under tight budgets.
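
The propose-then-judge loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not MemRefine's actual implementation: the budget is counted in entries, similarity is token-level Jaccard overlap, the names `refine`, `jaccard`, and `toy_judge` are hypothetical, and `toy_judge` is a rule-based stand-in for the LLM judge so that the example runs without a model.

```python
from itertools import combinations
from typing import Callable, List, Tuple

# Hypothetical judge signature: given two entries, return (action, merged_text).
# In the described framework this role is played by an LLM; here it is a stub.
Judge = Callable[[str, str], Tuple[str, str]]


def jaccard(a: str, b: str) -> float:
    """Surface similarity (assumed metric): token-set Jaccard overlap."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def refine(memory: List[str], budget: int, judge: Judge) -> List[str]:
    """Shrink `memory` toward `budget` entries (budget unit is an assumption)."""
    store = list(memory)
    preserved = set()  # pairs the judge has already chosen to keep apart
    while len(store) > budget:
        # Similarity only proposes candidates; it never decides anything.
        candidates = [
            (jaccard(store[i], store[j]), i, j)
            for i, j in combinations(range(len(store)), 2)
            if (store[i], store[j]) not in preserved
        ]
        if not candidates:
            break  # sketch-only stopping rule: nothing left to propose
        _, i, j = max(candidates)
        action, merged = judge(store[i], store[j])
        if action == "delete_first":
            del store[i]
        elif action == "delete_second":
            del store[j]
        elif action == "merge":
            store[i] = merged  # i < j, so deleting j leaves index i valid
            del store[j]
        else:  # "preserve"
            preserved.add((store[i], store[j]))
    return store


def toy_judge(a: str, b: str) -> Tuple[str, str]:
    """Stand-in for the LLM judge: drop an entry whose tokens the other covers."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if ta <= tb:
        return "delete_first", ""
    if tb <= ta:
        return "delete_second", ""
    return "preserve", ""


memory = ["Alice likes tea", "Alice likes green tea", "Bob moved to Paris"]
print(refine(memory, budget=2, judge=toy_judge))
# ['Alice likes green tea', 'Bob moved to Paris']
```

In this sketch the loop also stops once the judge has preserved every remaining candidate pair, so a very tight budget may go unmet. That early exit is a simplification of the sketch and should not be read as the behaviour of the framework described above.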