ConvMemory: A Lightweight Learned Memory Reranker, a Negative Attribution Result, and a Research-Preview Conflict Editor

We describe ConvMemory, a 3.6M-parameter learned reranker for conversational long-term memory retrieval, trained with cross-encoder teacher supervision over fused dense and lexical features. On the LongMemEval memory family, ConvMemory exceeds the BGE-large cross-encoder in Recall@10 at 12-47x lower latency. On Clean500 it stays within 0.025 Recall@10 of mxbai-rerank-large-v1 while running 28x cheaper; under Stress1000 distractors the Recall@10 gap widens to 0.081, but ConvMemory still runs at 117x lower latency. These LongMemEval numbers are single-run or single-seed and are reported as indicative cost-frontier evidence, not benchmark-grade results. We then publish a rigorous negative attribution result on a previously claimed mechanism: a five-seed retrained ablation with paired bootstrap shows that the effect of ConvMemory's learned temporal window is statistically significant in aggregate but not temporally specific, with the largest effects on hard non-temporal controls and no significant effect on multi-hop temporal queries. The mechanism is therefore more accurately described as cheap cross-encoder distillation in a fused dense+lexical feature space than as temporal-structure exploitation. We additionally release CCGE-LA, a low-amplitude conflict-aware candidate-set editor over ConvMemory, as a research preview with modest but consistent gains on the supersession and stale/rescue slices of LoCoMo. All results are retrieval-stage; ConvMemory does not match mxbai-rerank-large-v1 in absolute LoCoMo MRR, and the report is single-author and not yet independently audited.
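To make the evaluation vocabulary concrete, the following is a minimal sketch of Recall@10 and of a paired bootstrap over per-query score differences. It is not the report's own evaluation code: the function names, the number of resamples, the percentile confidence interval, and the way seeds would be pooled are all assumptions made for illustration.

```python
import random
from typing import Sequence, Set, Tuple


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Fraction of the relevant items that appear in the top-k of a ranking."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def paired_bootstrap(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    n_resamples: int = 2000,  # assumed value, not taken from the report
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Mean per-query difference (a - b) with a 95% percentile interval.

    The two score lists must be aligned by query. Each resample draws
    queries with replacement and keeps the pairing intact, so only the
    per-query differences are resampled.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and aligned by query")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    means = []
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    low = means[int(0.025 * n_resamples)]
    high = means[int(0.975 * n_resamples) - 1]
    return sum(diffs) / n, low, high
```

Under these assumptions, an ablation effect on a query slice would be called significant when the interval excludes zero. Running the same procedure separately on temporal and non-temporal slices is one way to check whether an effect is temporally specific.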

