TOKI: A Bitemporal Operator Algebra for Contradiction Resolution in LLM-Agent Persistent Memory

from arxiv, 43 pages including full appendices (proofs, protocols, and reproducibility ledger). Code, data, and reproducibility artifact: https://github.com/ZenAlexa/toki-bitemporal-memory

Persistent memory for an LLM agent is a write-heavy substrate: every belief update is a versioned write, and a new claim may contradict a stored one. Production systems use four resolution heuristics (last-writer-wins, evidence-weighted merge, await-confirmation, per-rule policy), yet none declares the isolation level it assumes or the write-time anomalies it admits. We show that contradiction resolution is write-time concurrency control and make the missing contract explicit. TOKI types the four heuristics as one family of bitemporal operators over a dual-row schema, each with an isolation precondition and a provenance annotation that preserves the losing fact in an audit row. Four soundness theorems close the contract across isolation, schema, and provenance, lift the guarantees to operator pipelines, and extend the fold operators to n-ary conflict sets. A tightness companion proves that, within the relational schedule model, keyed logging of the adjudicating judge is necessary for replay consistency, which every audited baseline omits. A verdict matrix over eight systems localizes the gap: every baseline that keeps a language-model judge on the write path admits at least one of three write-time anomalies (replay inconsistency, belief-drift skew, audit erasure); a content-addressed engine-layer comparator avoids them only by removing the judge, and TOKI alone excludes all three while keeping it. On its one natural-workload slice the audit-row defence moves LoCoMo by 0.86, and ablating the typed memory layer removes 0.49 accuracy on 1,444 answerable LoCoMo questions; the cross-system comparison stays underpowered and claims no superiority. The contribution is the contract: a write-time correctness specification, proved sound across isolation, schema, and provenance, pinning the guarantee every production heuristic assumes but no deployed system makes explicit.

翻译：LLM智能体持久化记忆是一个写密集型底层系统：每次信念更新都是一次版本化写入，新断言可能与已有存储断言相矛盾。生产系统采用四种解决启发式策略（最后写入者获胜、证据加权合并、等待确认、按规则策略），但均未声明其假设的隔离级别或允许的写入时异常。我们证明矛盾化解本质上是写入时并发控制，并明确显式化缺失的规约。TOKI将四种启发式策略归类为基于双行模式的双时态算子家族，每个算子均包含隔离前置条件和保留失败事实至审计行的出处标注。四个完整性定理通过隔离、模式和出处间的规约闭环，将保证提升至算子流水线，并扩展fold算子以支持n元冲突集。紧致性补定理证明：在关系型调度模型内，仲裁裁判的关键键日志记录对于重放一致性是必要的（而所有经审计的基线均忽略此要求）。基于八个系统的裁决矩阵定位了差距：所有将语言模型裁判置于写入路径的基线至少呈现三种写入时异常之一（重放不一致、信念漂移偏差、审计擦除）；基于内容寻址的引擎层比较器通过移除裁判避免异常，而唯有TOKI在保留裁判的前提下排除所有三种异常。在其单一自然工作负载切片上，审计行防御使LoCoMo提升0.86分；剥离类型化记忆层后，在1444个可回答的LoCoMo问题上精度降低0.49。跨系统比较因统计效力不足而不宣称优越性。本文贡献在于该规约：一个经隔离、模式和出处完整性证明的写入时正确性规范，明确了每个生产启发式策略假设但无部署系统显式化的保证。