Recent conversational memory systems invest heavily in LLM-based structuring at ingestion time and learned retrieval policies at query time. We show that neither is necessary. SmartSearch retrieves from raw, unstructured conversation history using a fully deterministic pipeline: NER-weighted substring matching for recall, rule-based entity discovery for multi-hop expansion, and a CrossEncoder+ColBERT rank-fusion stage -- the only learned component -- running on CPU in ~650ms. Oracle analysis on two benchmarks locates the bottleneck in context compilation rather than retrieval: retrieval recall reaches 98.6%, yet without intelligent ranking only 22.5% of gold evidence survives truncation to the token budget. With score-adaptive truncation and no per-dataset tuning, SmartSearch achieves 93.5% on LoCoMo and 88.4% on LongMemEval-S, exceeding all known memory systems evaluated under the same protocol on both benchmarks while using 8.5x fewer tokens than full-context baselines.
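The score-adaptive truncation idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the relative-score cutoff rule, and the whitespace token proxy are all assumptions made for the example. The point is that the cutoff adapts to the score distribution of the ranked hits rather than using a fixed count.

```python
def score_adaptive_truncate(passages, budget, rel_floor=0.35):
    """Hypothetical sketch of score-adaptive truncation.

    Pack ranked (text, score) passages into a token budget, stopping
    once a passage's score drops below rel_floor * top_score -- i.e.
    the cutoff adapts to how confident the ranker is about the best hit.
    Token cost is approximated by whitespace word count.
    """
    if not passages:
        return []
    ranked = sorted(passages, key=lambda p: p[1], reverse=True)
    top_score = ranked[0][1]
    kept, used = [], 0
    for text, score in ranked:
        if score < rel_floor * top_score:  # adaptive cutoff, not a fixed k
            break
        cost = len(text.split())           # crude token proxy
        if used + cost > budget:
            continue                       # skip hits that would overflow
        kept.append(text)
        used += cost
    return kept
```

For example, with hits scored 0.9, 0.8, and 0.1, the third hit is dropped by the adaptive floor regardless of remaining budget, while the first two are packed until the budget is exhausted.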