Personalized large language models (LLMs) rely on memory retrieval to incorporate user-specific histories, preferences, and contexts. Existing approaches either overload the LLM by feeding the user's entire memory history into the prompt, which is costly and unscalable, or simplify retrieval into a one-shot similarity search, which captures only surface matches. Cognitive science, however, shows that human memory operates through a dual process: Familiarity, which offers fast but coarse recognition, and Recollection, which enables deliberate, chain-like reconstruction to deeply recover episodic content. Current systems lack both the ability to perform recollection-style retrieval and a mechanism to adaptively switch between the two retrieval paths, leading to either insufficient recall or the inclusion of noise. To address this, we propose RF-Mem (Recollection-Familiarity Memory Retrieval), a dual-path memory retriever guided by familiarity uncertainty. RF-Mem measures the familiarity signal through the mean similarity score and its entropy. High familiarity triggers the direct top-K Familiarity retrieval path, while low familiarity activates the Recollection path. In the Recollection path, the system clusters candidate memories and applies an alpha-mix with the query embedding to iteratively expand evidence in embedding space, simulating deliberate contextual reconstruction. This design embeds human-like dual-process recognition into the retriever, avoiding full-context overhead and enabling scalable, adaptive personalization. Experiments across three benchmarks and corpus scales demonstrate that RF-Mem consistently outperforms both one-shot retrieval and full-context reasoning under fixed budget and latency constraints. Our code can be found in the Reproducibility Statement.
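The uncertainty-gated dual-path retrieval described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the thresholds `tau_score` and `tau_entropy`, the step count, and the use of a top-hit centroid as a stand-in for the clustering step are all assumptions introduced here for clarity.

```python
import numpy as np

def rf_mem_retrieve(query_emb, memory_embs, k=5,
                    tau_score=0.6, tau_entropy=2.0,
                    alpha=0.5, n_steps=3):
    """Sketch of familiarity-gated dual-path retrieval (illustrative only)."""
    # Familiarity signal: cosine similarity of the query to every memory.
    norms = np.linalg.norm(memory_embs, axis=1) * np.linalg.norm(query_emb) + 1e-9
    sims = memory_embs @ query_emb / norms
    probs = np.exp(sims) / np.exp(sims).sum()       # softmax over memories
    mean_score = sims.mean()
    entropy = -(probs * np.log(probs + 1e-12)).sum()

    # High familiarity (strong, low-entropy signal): one-shot top-K path.
    if mean_score >= tau_score and entropy <= tau_entropy:
        return list(np.argsort(-sims)[:k])

    # Low familiarity: Recollection path. Alpha-mix the query with the
    # centroid of the current evidence and re-query, expanding iteratively.
    q, evidence = query_emb.astype(float), []
    for _ in range(n_steps):
        sims = memory_embs @ q / (
            np.linalg.norm(memory_embs, axis=1) * np.linalg.norm(q) + 1e-9)
        top = np.argsort(-sims)[:k]
        evidence.extend(i for i in top if i not in evidence)
        centroid = memory_embs[top].mean(axis=0)    # stand-in for clustering
        q = alpha * query_emb + (1 - alpha) * centroid  # alpha-mix
    return evidence[:k]  # cap to the fixed retrieval budget
```

The gate is the key design choice: a confident, concentrated familiarity signal means a single similarity search suffices, while a diffuse signal indicates the relevant evidence is spread out and must be reconstructed through repeated, query-shifted lookups.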