AgenticRec: End-to-End Tool-Integrated Policy Optimization for Ranking-Oriented Recommender Agents

Recommender agents built on Large Language Models offer a promising paradigm for recommendation. However, existing recommender agents typically suffer from a disconnect between intermediate reasoning and final ranking feedback, and they fail to capture fine-grained preferences. To address this, we present AgenticRec, a ranking-oriented agentic recommendation framework that optimizes the entire decision-making trajectory (intermediate reasoning, tool invocation, and final ranking-list generation) under sparse implicit feedback. Our approach makes three key contributions. First, we design a suite of recommendation-specific tools integrated into a ReAct loop to support evidence-grounded reasoning. Second, we propose a theoretically unbiased List-Wise Group Relative Policy Optimization (list-wise GRPO) that maximizes ranking utility and ensures accurate credit assignment for complex tool-use trajectories. Third, we introduce Progressive Preference Refinement (PPR) to resolve fine-grained preference ambiguities. By mining hard negatives from ranking violations and applying bidirectional preference alignment, PPR minimizes a convex upper bound on the pairwise ranking error. Experiments on benchmarks confirm that AgenticRec significantly outperforms baselines, validating the necessity of unifying reasoning, tool use, and ranking optimization.
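
To make the optimization ideas in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes details the abstract does not state: NDCG@k with binary relevance as the ranking utility, the commonly used mean/standard-deviation standardization for group-relative advantages, and a base-2 logistic loss as one example of a convex upper bound on the pairwise 0-1 ranking error. All function names (`ndcg_at_k`, `group_relative_advantages`, `ranking_violations`, `pairwise_logistic_loss`) are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Set, Tuple


def ndcg_at_k(ranked_items: Sequence[str], positives: Set[str], k: int) -> float:
    """NDCG@k with binary relevance (an assumed reward; the abstract names none)."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in positives
    )
    ideal_hits = min(len(positives), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Standardize rewards within one group of sampled trajectories (GRPO-style)."""
    mean = sum(rewards) / len(rewards)
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / len(rewards))
    return [(r - mean) / (std + eps) for r in rewards]


def ranking_violations(ranked_items: Sequence[str], positives: Set[str]) -> List[Tuple[str, str]]:
    """Return (positive, negative) pairs where the negative is ranked above the positive."""
    pairs: List[Tuple[str, str]] = []
    negatives_seen: List[str] = []
    for item in ranked_items:
        if item in positives:
            pairs.extend((item, neg) for neg in negatives_seen)
        else:
            negatives_seen.append(item)
    return pairs


def pairwise_logistic_loss(score_pos: float, score_neg: float) -> float:
    """log2(1 + exp(-(s_pos - s_neg))): convex, and never below the 0-1 pairwise error."""
    return math.log2(1.0 + math.exp(-(score_pos - score_neg)))


if __name__ == "__main__":
    # One group of three sampled ranking lists for the same user; "a" is the held-out positive.
    group = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
    positives = {"a"}
    rewards = [ndcg_at_k(ranking, positives, k=3) for ranking in group]
    print(rewards)
    print(group_relative_advantages(rewards))
    print(ranking_violations(group[2], positives))
```

In this sketch each sampled trajectory receives a single list-level reward, and the advantage measures how much better or worse it ranked than its siblings in the same group. The violation pairs are candidate hard negatives that a pairwise loss could then penalize. How AgenticRec actually defines its reward, advantage estimator, and bidirectional alignment objective is specified in the paper rather than in this abstract.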

