Large language models (LLMs) are increasingly integrated into recommender systems, motivating recent interest in agentic and reasoning-based recommendation. However, most existing approaches still rely on fixed workflows, applying the same reasoning procedure across diverse recommendation scenarios. In practice, user contexts vary substantially: in cold-start settings or during interest shifts, for example, an agent should adaptively decide what evidence to gather next rather than follow a scripted process. To address this, we propose ChainRec, an agentic recommender that uses a planner to dynamically select reasoning tools. ChainRec builds a standardized Tool Agent Library from expert trajectories, then trains a planner with supervised fine-tuning and preference optimization to select tools, decide their order, and determine when to stop. Experiments on AgentRecBench across Amazon, Yelp, and Goodreads show that ChainRec consistently improves Avg HR@{1,3,5} over strong baselines, with especially notable gains in cold-start and evolving-interest scenarios. Ablation studies further validate the importance of tool standardization and preference-optimized planning.