Learning Query-Aware Budget-Tier Routing for Runtime Agent Memory

Memory is increasingly central to Large Language Model (LLM) agents operating beyond a single context window, yet most existing systems rely on offline, query-agnostic memory construction that can be inefficient and may discard query-critical information. Although runtime memory utilization is a natural alternative, prior work often incurs substantial overhead and offers limited explicit control over the performance-cost trade-off. In this work, we present \textbf{BudgetMem}, a runtime agent memory framework for explicit, query-aware performance-cost control. BudgetMem structures memory processing as a set of memory modules, each offered in three budget tiers (i.e., \textsc{Low}/\textsc{Mid}/\textsc{High}). A lightweight router, implemented as a compact neural policy trained with reinforcement learning, performs budget-tier routing across modules to balance task performance against memory construction cost. Using BudgetMem as a unified testbed, we study three complementary strategies for realizing budget tiers: implementation (method complexity), reasoning (inference behavior), and capacity (module model size). Across LoCoMo, LongMemEval, and HotpotQA, BudgetMem surpasses strong baselines when performance is prioritized (i.e., the high-budget setting) and delivers better accuracy-cost frontiers under tighter budgets. Moreover, our analysis disentangles the strengths and weaknesses of the different tiering strategies, clarifying when each axis delivers the most favorable trade-off under varying budget regimes.
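To make the idea of budget-tier routing concrete, the following is a minimal illustrative sketch and not the BudgetMem implementation. It assumes a linear softmax policy as a stand-in for the compact neural policy, greedy tier selection at inference time, and a reward of the form task score minus weighted cost. The names `TierRouter`, `TIER_COST`, and `reward`, the module names, the query features, and all numeric values are hypothetical and do not come from the paper. Reinforcement learning training of the weights is omitted.

```python
import math
from dataclasses import dataclass

TIERS = ("LOW", "MID", "HIGH")

# Hypothetical per-tier construction costs in arbitrary units (an assumption).
TIER_COST = {"LOW": 1.0, "MID": 3.0, "HIGH": 9.0}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


@dataclass
class TierRouter:
    """Toy query-aware router: one linear scorer per (module, tier) pair.

    weights[module][tier] holds one weight per query feature. A real policy
    would be a small neural network whose parameters are learned with RL.
    """
    weights: dict

    def tier_probs(self, module, features):
        """Return a probability for each tier of one module, given query features."""
        logits = [
            sum(w * f for w, f in zip(self.weights[module][tier], features))
            for tier in TIERS
        ]
        return softmax(logits)

    def route(self, features):
        """Greedily pick the most probable tier for every module."""
        plan = {}
        for module in self.weights:
            probs = self.tier_probs(module, features)
            plan[module] = TIERS[probs.index(max(probs))]
        return plan


def reward(task_score, plan, cost_weight):
    """Assumed scalar reward: task score minus weighted total tier cost."""
    total_cost = sum(TIER_COST[tier] for tier in plan.values())
    return task_score - cost_weight * total_cost


# Usage: features are [bias, estimated query difficulty], both hypothetical.
shared = {"LOW": [1.0, -2.0], "MID": [0.5, 0.0], "HIGH": [0.0, 2.0]}
router = TierRouter(weights={"filter": shared, "summarize": shared})
easy_plan = router.route([1.0, 0.0])
hard_plan = router.route([1.0, 1.0])
```

In this sketch an easy query is routed to the cheap tier for every module and a hard query to the expensive tier, while `cost_weight` plays the role of the knob that shifts the policy along the accuracy-cost frontier.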
