What Spatial Memory Must Store: Occlusion as the Test for Language-Agent Memory

Language-agent "memory palace" systems anchor each memory to a world coordinate, on the intuition that geometry adds something text cannot. We make that intuition testable and report three results. First, the memory-palace default of folding spatial proximity into a linear blend with recency and importance does not help and can hurt. In a pre-registered recall experiment, the shipped blend fails its own frozen test (mean Delta-Hit@5 = -0.0375, Wilcoxon p = 0.306) and sits at the position-blind baseline, while a geometry-led weighting wins decisively (+0.3208, p < 10^-15). Geometry must lead recall when the query regime is spatial. Second, memory recall and visibility must be separated. Recall is occlusion-blind by design (an agent correctly remembers the next room even though it lies behind a wall), whereas visibility is a perception predicate over stored geometry, one that the live system never computed. A one-line ray-versus-voxel digital differential analyzer (DDA), re-pointed from the gaze ray the agent already casts, supplies it: the text-only baseline and the live field-of-view (FoV) cone both score 0.000 on 849 behind-wall targets, while cone-plus-DDA reaches 0.982 (exact McNemar p < 10^-6). Separately, coordinate recall resolves near-duplicate locations that a cosine-similarity null baseline cannot (1.000 vs 0.533, n = 150). Third, the visibility predicate is confirmed live under a git-committed pre-registration (SPMEM-OCC-LIVE-v1: eight scripted worlds, automated oracle scoring, 96 behind-wall targets, false-visible rate falling from 1.000 to 0.000, pooled exact McNemar p = 2.5 × 10^-29). That run also surfaced a real relay anchor defect, which was then fixed. We concede that occlusion-needs-geometry is near-tautological; the contribution is the measurement and the isolation, which separate what spatial memory must store from how it is read. These pilots power a frozen confirmatory study (SPMEM-ZERO-REAL-PREREG-v1); the full human-authored multi-world study with blind raters remains future work.
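
To make the recall-versus-visibility distinction concrete, the following is a minimal sketch of a cone-plus-DDA visibility predicate. It is not the system's actual code. It assumes an Amanatides-Woo style grid traversal, an occupancy grid stored as a set of integer voxel triples, and a unit voxel size. The names `in_fov_cone`, `line_of_sight`, and `is_visible`, along with every parameter, are hypothetical and chosen for illustration only.

```python
import math

def voxel_of(point, size):
    """Integer voxel index containing a world-space point (assumed grid layout)."""
    return tuple(math.floor(c / size) for c in point)

def in_fov_cone(origin, gaze, target, half_angle_deg):
    """True if target lies inside the view cone around the gaze direction."""
    to_target = [t - o for o, t in zip(origin, target)]
    norm_t = math.sqrt(sum(c * c for c in to_target))
    norm_g = math.sqrt(sum(c * c for c in gaze))
    if norm_g == 0.0:
        raise ValueError("gaze direction must be non-zero")
    if norm_t == 0.0:
        return True  # target coincides with the eye
    cos_angle = sum(g * c for g, c in zip(gaze, to_target)) / (norm_g * norm_t)
    return cos_angle >= math.cos(math.radians(half_angle_deg))

def line_of_sight(origin, target, occupied, size=1.0):
    """DDA walk from origin to target; False if an occupied voxel lies between them."""
    cur = list(voxel_of(origin, size))
    end = voxel_of(target, size)
    direction = [t - o for o, t in zip(origin, target)]
    step, t_max, t_delta = [], [], []
    for axis in range(3):
        d = direction[axis]
        if d > 0:
            step.append(1)
            t_max.append(((cur[axis] + 1) * size - origin[axis]) / d)
            t_delta.append(size / d)
        elif d < 0:
            step.append(-1)
            t_max.append((cur[axis] * size - origin[axis]) / d)
            t_delta.append(-size / d)
        else:
            step.append(0)
            t_max.append(math.inf)
            t_delta.append(math.inf)
    # t parametrizes origin + t * direction, so the target sits at t = 1.
    while tuple(cur) != end:
        axis = t_max.index(min(t_max))  # next voxel boundary the ray crosses
        if t_max[axis] > 1.0:
            break  # numerical guard: the segment ends before another crossing
        cur[axis] += step[axis]
        t_max[axis] += t_delta[axis]
        if tuple(cur) != end and tuple(cur) in occupied:
            return False  # a wall voxel blocks the ray before the target
    return True

def is_visible(origin, gaze, target, occupied, half_angle_deg, size=1.0):
    """Cone-plus-DDA: inside the view cone and not occluded by stored geometry."""
    return in_fov_cone(origin, gaze, target, half_angle_deg) and line_of_sight(
        origin, target, occupied, size
    )

# Toy scene: one wall voxel between the eye and a remembered target.
wall = {(2, 0, 0)}
eye, gaze, behind_wall = (0.5, 0.5, 0.5), (1.0, 0.0, 0.0), (4.5, 0.5, 0.5)
cone_only = in_fov_cone(eye, gaze, behind_wall, 45.0)        # True: cone ignores the wall
cone_dda = is_visible(eye, gaze, behind_wall, wall, 45.0)    # False: the DDA finds the wall
```

In this toy scene the cone alone reports the behind-wall target as visible, which is the false-visible failure described above, while the combined predicate rejects it. The stored coordinate is unchanged in both cases, so recall is unaffected; only the read-time visibility test differs.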
