Large language model (LLM)-based recommendation agents personalize what they know through evolving per-user semantic memory, yet how they reason is still governed by a single static system prompt shared identically across all users. This asymmetry is a fundamental bottleneck: when a recommendation fails, the agent updates its memory of user preferences but never interrogates the decision logic that produced the failure, so its reasoning process stays structurally unchanged no matter how many mistakes it accumulates. To address this bottleneck, we propose SAGER (Self-Evolving Agent for Personalized Recommendation), the first recommendation agent framework in which each user is equipped with a dedicated policy skill: a structured natural-language document that encodes personalized decision principles and evolves continuously through interaction. SAGER introduces three components: (i) a two-representation skill architecture that decouples a rich evolution substrate from a minimal inference-time injection; (ii) an incremental contrastive chain-of-thought engine that diagnoses reasoning flaws by contrasting accepted items against unchosen ones while preserving accumulated priors; and (iii) skill-augmented listwise reasoning that creates fine-grained decision boundaries where the evolved skill provides genuine discriminative value. Experiments on four public benchmarks show that SAGER achieves state-of-the-art performance, with gains orthogonal to memory accumulation, confirming that personalizing the reasoning process itself is a qualitatively distinct source of recommendation improvement.