Large-scale industrial recommenders typically use a fixed multi-stage pipeline (recall, ranking, re-ranking) and have progressed from collaborative filtering to deep and large pre-trained models. However, both multi-stage and so-called One Model designs remain essentially static: models are black boxes, and system improvement relies on manual hypotheses and engineering, which is hard to scale under heterogeneous data and multi-objective business constraints. We propose an Agentic Recommender System (AgenticRS) that reorganizes key modules as agents. A module is promoted to an agent only when it forms a functionally closed loop, can be evaluated independently, and has an evolvable decision space. For model agents, we outline two self-evolution mechanisms: reinforcement-learning-style optimization in well-defined action spaces, and large-language-model-based generation and selection of new architectures and training schemes in open-ended design spaces. We further distinguish individual evolution of single agents from compositional evolution over how multiple agents are selected and connected, and we use a layered inner and outer reward design to couple local optimization with global objectives. Together, these elements provide a concise blueprint for turning static pipelines into self-evolving agentic recommender systems.
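As a minimal illustration of two ideas in the abstract, the sketch below encodes the three promotion criteria as a simple predicate and combines an inner (local) and outer (global) reward. All names here (`Module`, `is_agent`, `layered_reward`, `outer_weight`) are hypothetical, and the linear mixing rule is an assumption made for illustration only; the abstract does not specify how the two reward layers are coupled.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """Hypothetical description of a pipeline module (names are illustrative)."""
    name: str
    closed_loop: bool               # forms a functionally closed loop
    independently_evaluable: bool   # can be evaluated on its own
    evolvable_decision_space: bool  # has a decision space that can evolve


def is_agent(module: Module) -> bool:
    """Promote a module to an agent only if all three criteria hold."""
    return (
        module.closed_loop
        and module.independently_evaluable
        and module.evolvable_decision_space
    )


def layered_reward(inner: float, outer: float, outer_weight: float = 0.5) -> float:
    """Mix a local (inner) reward with a global (outer) reward.

    Assumption: a convex combination. The source only states that the two
    layers are coupled, not the form of the coupling.
    """
    if not 0.0 <= outer_weight <= 1.0:
        raise ValueError("outer_weight must lie in [0, 1]")
    return (1.0 - outer_weight) * inner + outer_weight * outer


ranker = Module("ranker", True, True, True)
logger = Module("logger", True, True, False)  # no evolvable decision space
candidates = [m.name for m in (ranker, logger) if is_agent(m)]
```

Under these assumptions, only `ranker` would be promoted, since `logger` fails the third criterion. A larger `outer_weight` would pull each agent's optimization further toward the global objective.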