PEEK: Context Map as an Orientation Cache for Long-Context LLM Agents

Large language model (LLM) agents increasingly operate over long and recurring external contexts, such as document corpora and code repositories. Across invocations, existing approaches preserve the agent's trajectory, passive access to raw material, or task-level strategies. None of them preserves what we argue is most needed for repeated same-context workloads: reusable orientation knowledge about the recurring context itself, e.g., what the context contains, how it is organized, and which entities, constants, and schemas have historically been useful. We introduce PEEK, a system that caches and maintains this orientation knowledge as a context map: a small, constant-sized artifact in the agent's prompt that gives it a persistent peek into the external context. The map is maintained by a programmable cache policy with three modules: a Distiller that extracts transferable knowledge from inference-time signals, a Cartographer that translates it into structured edits, and a priority-based Evictor that enforces a fixed token budget. On long-context reasoning and information aggregation, PEEK improves over strong baselines by 6.3-34.0% while using 93-145 fewer iterations and incurring 1.7-5.8x lower cost than the state-of-the-art prompt-learning framework, ACE. On context learning, PEEK improves solving rate and rubric accuracy by 6.0-14.0% and 7.8-12.1%, respectively, at 1.4x lower cost than ACE. These gains generalize across LMs and agent architectures, including OpenAI Codex, a production-grade coding agent. Together, these results show that a context map helps long-context LLM agents interact with recurring external contexts more accurately and efficiently.
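To make the idea of a priority-based Evictor under a fixed token budget concrete, here is a minimal Python sketch. It is not PEEK's implementation: the names `MapEntry`, `ContextMap`, `apply_edit`, and `count_tokens` are hypothetical, the whitespace token count stands in for a real tokenizer, and the Distiller and Cartographer are not modeled (their output is assumed to arrive as ready-made entries with a priority score).

```python
from dataclasses import dataclass, field


def count_tokens(text: str) -> int:
    # Assumption: a whitespace word count stands in for a real tokenizer.
    return len(text.split())


@dataclass
class MapEntry:
    key: str         # hypothetical identifier, e.g. "layout" or "schema"
    text: str        # the orientation note kept in the prompt
    priority: float  # assumed to be assigned upstream; higher means keep longer


@dataclass
class ContextMap:
    budget: int  # fixed token budget for the whole map
    entries: dict = field(default_factory=dict)

    def size(self) -> int:
        # Total tokens currently held by the map.
        return sum(count_tokens(e.text) for e in self.entries.values())

    def apply_edit(self, entry: MapEntry) -> None:
        # Insert or overwrite an entry, then enforce the budget.
        self.entries[entry.key] = entry
        self.evict()

    def evict(self) -> None:
        # Drop the lowest-priority entries until the map fits the budget.
        while self.entries and self.size() > self.budget:
            lowest = min(self.entries.values(), key=lambda e: e.priority)
            del self.entries[lowest.key]

    def render(self) -> str:
        # Render the map as prompt text, highest priority first.
        ordered = sorted(self.entries.values(), key=lambda e: -e.priority)
        return "\n".join(f"- {e.key}: {e.text}" for e in ordered)


demo = ContextMap(budget=10)
demo.apply_edit(MapEntry("layout", "docs are grouped by year", 0.9))
demo.apply_edit(MapEntry("schema", "orders table has customer_id", 0.5))
demo.apply_edit(MapEntry("constant", "tax rate 0.2", 0.95))
print(demo.render())
```

In this toy run the third edit pushes the map over its 10-token budget, so the lowest-priority entry (`schema`) is evicted and the map stays within a constant size regardless of how many edits arrive.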
