Recommender systems are tasked with inferring users' evolving preferences and ranking items aligned with their intents, which calls for in-depth reasoning beyond pattern-based scoring. Recent efforts have started leveraging large language models (LLMs) for recommendation, but how to effectively optimize the model for improved recommendation utility remains underexplored. In this work, we propose Reasoning to Rank, an end-to-end training framework that internalizes recommendation utility optimization into the learning of step-by-step reasoning in LLMs. To avoid position bias in LLM reasoning and to enable direct optimization of the reasoning process, our framework performs reasoning at the user-item level and employs reinforcement learning for end-to-end training of the LLM. Experiments on three Amazon datasets and a large-scale industrial dataset show consistent gains over strong conventional and LLM-based baselines. Extensive in-depth analyses validate the necessity of the key components of the proposed framework and shed light on future developments in this line of work.