Sequential Recommender Systems (SRS) predict the next item of interest from users' interaction histories and have been widely deployed, but they are hindered by the long-tail problem. Large Language Models (LLMs), with strong semantic understanding and reasoning capabilities, offer a promising way to enrich item semantics and have recently been used as embedding generators. However, two fundamental gaps remain. First, current LLM-based embedding methods fail to exploit the model's inner reasoning capacity. Second, existing methods often inject collaborative signals only implicitly via supervised fine-tuning, and so lack explicit guidance for collaborative embedding alignment. In this paper, we introduce ReaEmb, a novel framework that addresses both issues via a Latent Reasoning-enhanced Contrastive Learning (LRCL) stage and a Collaborative Reward Reinforcement Learning (CRRL) stage. LRCL exploits the LLM's inner reasoning capacity through a two-pass forward process with an additional attention module. CRRL then explicitly injects collaborative signals into the LLM via tailored reinforcement learning. Extensive experiments on three real-world datasets demonstrate the superior effectiveness of ReaEmb across multiple SRS models. To ease reproducibility, we release the code online.