Large language model (LLM)-based recommendation agents personalize what they know through evolving per-user semantic memory, yet how they reason remains a universal, static system prompt shared identically across all users. This asymmetry is a fundamental bottleneck: when a recommendation fails, the agent updates its memory of user preferences but never interrogates the decision logic that produced the failure, leaving its reasoning process structurally unchanged regardless of how many mistakes it accumulates. To address this bottleneck, we propose SAGER (Self-Evolving Agent for Personalized Recommendation), the first recommendation agent framework in which each user is equipped with a dedicated policy skill, a structured natural-language document encoding personalized decision principles that evolves continuously through interaction. SAGER introduces a two-representation skill architecture that decouples a rich evolution substrate from a minimal inference-time injection, an incremental contrastive chain-of-thought engine that diagnoses reasoning flaws by contrasting accepted against unchosen items while preserving accumulated priors, and skill-augmented listwise reasoning that creates fine-grained decision boundaries where the evolved skill provides genuine discriminative value. Experiments on four public benchmarks demonstrate that SAGER achieves state-of-the-art performance, with gains orthogonal to memory accumulation, confirming that personalizing the reasoning process itself is a qualitatively distinct source of recommendation improvement.
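The abstract names three components: a two-representation skill, an incremental contrastive update, and skill-augmented listwise reasoning. The following is a minimal sketch of how these pieces might fit together, not the authors' implementation. Every name here (`PolicySkill`, `evolve`, `injection`, `build_listwise_prompt`, `toy_diagnose`, `max_injected`) is hypothetical, the "keep the most recent principles" rule for the compact view is an assumption, and the diagnosis step, which would presumably be an LLM call, is replaced by a stub.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PolicySkill:
    """Hypothetical per-user skill with two views of the same content."""
    user_id: str
    # Rich representation: the full list of principles, used when evolving.
    principles: List[str] = field(default_factory=list)
    # Assumed cap on how many principles reach the inference-time prompt.
    max_injected: int = 3

    def evolve(self, accepted: str, unchosen: List[str],
               diagnose: Callable[[str, List[str], List[str]], str]) -> None:
        """Incremental update: contrast accepted vs. unchosen items.

        `diagnose` stands in for a contrastive chain-of-thought call. It sees
        the existing principles, and the result is appended, so earlier
        principles are preserved rather than overwritten.
        """
        lesson = diagnose(accepted, unchosen, list(self.principles))
        if lesson and lesson not in self.principles:
            self.principles.append(lesson)

    def injection(self) -> str:
        """Compact representation: only the latest few principles (assumed rule)."""
        recent = self.principles[-self.max_injected:]
        return "\n".join(f"- {p}" for p in recent)


def build_listwise_prompt(skill: PolicySkill, memory: str,
                          candidates: List[str]) -> str:
    """Assemble a listwise ranking prompt from memory, skill, and candidates."""
    items = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(candidates))
    return (f"User memory:\n{memory}\n\n"
            f"Decision principles for this user:\n{skill.injection()}\n\n"
            f"Rank all candidates from most to least suitable:\n{items}")


def toy_diagnose(accepted: str, unchosen: List[str], priors: List[str]) -> str:
    """Stub diagnosis; a real system would query a language model here."""
    return f"Prefer items like '{accepted}' over items like '{unchosen[0]}'"
```

In this sketch the memory string and the skill are kept as separate prompt sections, which mirrors the abstract's distinction between what the agent knows about a user and how it reasons for that user.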