Stateful tool-using LLM agents treat the context window as working memory, yet today's agent harnesses manage residency and durability on a best-effort basis, which causes recurring failures: lost state after compaction, bypassed flushes on reset, and destructive writeback. We present \textsc{ClawVM}, a virtual memory layer that manages state as typed pages with minimum-fidelity invariants, multi-resolution representations under a token budget, and validated writeback at every lifecycle boundary. Because the harness already assembles prompts, mediates tools, and observes lifecycle events, it is the natural enforcement point; placing the contract there makes residency and durability deterministic and auditable. Across synthetic workloads, 12 real-session traces, and adversarial stress tests, \textsc{ClawVM} eliminates all policy-controllable faults whenever the minimum-fidelity set fits within the token budget, a result confirmed by an offline oracle, and adds a median of less than 50 microseconds of policy-engine overhead per turn.
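To make the idea of typed pages with a minimum-fidelity floor and multiple resolutions under a token budget more concrete, the following is a minimal sketch and not the \textsc{ClawVM} implementation. The `Page` fields, the `assemble` function, and the greedy priority-ordered upgrade policy are all assumptions introduced for illustration; the abstract does not specify how resolutions are chosen.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """Hypothetical typed page; field names are illustrative, not from the paper."""
    name: str
    kind: str                # page type, e.g. "plan" or "tool_result"
    costs: tuple[int, ...]   # token cost per resolution level, coarsest first
    min_level: int           # lowest level the minimum-fidelity invariant allows
    priority: int = 0        # higher-priority pages are upgraded first


def assemble(pages: list[Page], budget: int) -> dict[str, int] | None:
    """Choose one resolution level per page under a token budget.

    Returns None when even the minimum-fidelity set does not fit, which is
    the case the abstract excludes from its fault-elimination claim.
    """
    # Start every page at its minimum-fidelity level.
    levels = {p.name: p.min_level for p in pages}
    used = sum(p.costs[p.min_level] for p in pages)
    if used > budget:
        return None

    # Assumed policy: spend leftover tokens on upgrades, highest priority first.
    for p in sorted(pages, key=lambda page: -page.priority):
        while levels[p.name] + 1 < len(p.costs):
            cur = levels[p.name]
            extra = p.costs[cur + 1] - p.costs[cur]
            if used + extra > budget:
                break
            levels[p.name] = cur + 1
            used += extra
    return levels


pages = [
    Page("plan", "plan", costs=(20, 60, 200), min_level=1, priority=2),
    Page("log", "tool_result", costs=(10, 80), min_level=0, priority=1),
]
print(assemble(pages, budget=100))
print(assemble(pages, budget=300))
print(assemble(pages, budget=50))
```

In this sketch the minimum-fidelity set is reserved before any upgrade is considered, so a page can never drop below its floor to make room for a richer view of another page. An infeasible budget is reported explicitly instead of being silently truncated.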