Persistent AI memory is often reduced to a retrieval problem: store prior interactions as text, embed them, and ask the model to recover relevant context later. This design is useful for thematic recall, but it is mismatched to the kinds of memory that agents need in production: exact facts, current state, updates and deletions, aggregation, relations, negative queries, and explicit unknowns. These operations require memory to behave less like search and more like a system of record. This paper argues that reliable external AI memory must be schema-grounded. Schemas define what must be remembered, what may be ignored, and which values must never be inferred. We present an iterative, schema-aware write path that decomposes memory ingestion into object detection, field detection, and field-value extraction, with validation gates, local retries, and stateful prompt control. The result shifts interpretation from the read path to the write path: reads become constrained queries over verified records rather than repeated inference over retrieved prose. We evaluate this design on structured extraction and end-to-end memory benchmarks. On the extraction benchmark, the judge-in-the-loop configuration reaches 90.42% object-level accuracy and 62.67% output accuracy, above all tested frontier structured-output baselines. On our end-to-end memory benchmark, xmemory reaches 97.10% F1, compared with 80.16%-87.24% across the third-party baselines. On the application-level task, xmemory reaches 95.2% accuracy, outperforming specialised memory systems, code-generated Markdown harnesses, and customer-facing frontier-model application harnesses. The results show that, for memory workloads requiring stable facts and stateful computation, architecture matters more than retrieval scale or model strength alone.
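To make the write path concrete, here is a minimal sketch of the three stages named above (object detection, field detection, field-value extraction), with a validation gate and local retries on each extracted value. It is an illustration under stated assumptions, not the xmemory implementation: `SCHEMA`, `with_retries`, `ingest`, `query`, and the keyword and regex "detectors" are hypothetical stand-ins for what would be model calls in a real system.

```python
import re

UNKNOWN = None  # explicit unknown: stored as-is, never inferred

# Hypothetical schema: object type -> {field name: validator (the validation gate)}
SCHEMA = {
    "contact": {
        "name": lambda v: isinstance(v, str) and v.strip() != "",
        "email": lambda v: isinstance(v, str) and "@" in v,
    }
}

def detect_objects(text):
    # Stage 1 stand-in: an object type counts as detected if its name appears.
    return [t for t in SCHEMA if t in text.lower()]

def detect_fields(text, obj_type):
    # Stage 2 stand-in: a field counts as detected if its name appears.
    return [f for f in SCHEMA[obj_type] if f in text.lower()]

def extract_value(text, field, attempt):
    # Stage 3 stand-in: parse "<field> is <value>" or "<field>: <value>".
    # `attempt` is unused here; a real system might adjust its prompt per retry.
    m = re.search(rf"{re.escape(field)}\s*(?:is|:)\s*(\S+)", text, re.I)
    return m.group(1).rstrip(".,;") if m else UNKNOWN

def with_retries(step, validate, max_attempts=3):
    # Local retry: re-run only this stage until its gate passes, else give up.
    for attempt in range(max_attempts):
        result = step(attempt)
        if validate(result):
            return result
    return UNKNOWN

def ingest(text, store):
    # Write path: all interpretation happens here, before anything is stored.
    for obj_type in detect_objects(text):
        record = {f: UNKNOWN for f in SCHEMA[obj_type]}
        for field in detect_fields(text, obj_type):
            record[field] = with_retries(
                lambda attempt: extract_value(text, field, attempt),
                SCHEMA[obj_type][field],
            )
        store.setdefault(obj_type, []).append(record)
    return store

def query(store, obj_type, **where):
    # Read path: a constrained lookup over verified records, with no inference.
    return [
        r for r in store.get(obj_type, [])
        if all(r.get(k) == v for k, v in where.items())
    ]
```

In this sketch a value that fails its gate after the retries is recorded as `UNKNOWN` rather than guessed, so a negative query such as `query(store, "contact", email=None)` can be answered directly from the records.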