Existing long-term personalized dialogue systems struggle to reconcile unbounded interaction streams with finite context constraints, often succumbing to memory-noise accumulation, reasoning degradation, and persona inconsistency. To address these challenges, this paper proposes Inside Out, a framework that maintains a global PersonaTree as the carrier of long-term user profiling. By anchoring the trunk with an initial schema and dynamically updating the branches and leaves, PersonaTree grows in a controlled manner, compressing memory while preserving consistency. We further train a lightweight MemListener via reinforcement learning with process-based rewards to emit structured, executable, and interpretable {ADD, UPDATE, DELETE, NO_OP} operations, thereby driving the dynamic evolution of the persona tree. During response generation, PersonaTree is leveraged directly to enhance outputs in latency-sensitive scenarios; when users request more detail, an agentic mode is triggered to introduce details on demand under the constraints of the PersonaTree. Experiments show that PersonaTree outperforms full-text concatenation and a range of personalized memory systems in suppressing contextual noise and maintaining persona consistency. Notably, the small MemListener model matches or even surpasses powerful reasoning models such as DeepSeek-R1-0528 and Gemini-3-Pro in memory-operation decision making.