Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation

Recently, large language models (LLMs) have shown great potential in recommender systems, either by improving existing recommendation models or by serving as the backbone. However, there is a large semantic gap between LLMs and recommender systems, since the items to be recommended are usually indexed by discrete identifiers (item IDs) that lie outside the LLM's vocabulary. In essence, LLMs capture language semantics whereas recommender systems encode collaborative semantics, which makes it difficult to fully exploit the capacity of LLMs for recommendation. To address this challenge, we propose LC-Rec, a new LLM-based recommendation model that better integrates language and collaborative semantics. Our approach generates items directly from the entire item set, without relying on candidate items. Specifically, we make two major contributions. For item indexing, we design a learning-based vector quantization method with uniform semantic mapping, which assigns meaningful and non-conflicting IDs (called item indices) to items. For alignment tuning, we propose a series of specially designed tuning tasks that strengthen the integration of collaborative semantics in LLMs. These fine-tuning tasks compel the LLM to deeply integrate language semantics with collaborative semantics (characterized by the learned item indices), and thereby adapt effectively to recommender systems. Extensive experiments demonstrate the effectiveness of our method, showing that it outperforms a number of competitive baselines, including both traditional recommenders and existing LLM-based recommenders. Our code is available at https://github.com/RUCAIBox/LC-Rec/.
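
To make the idea of item indices concrete, the following is a minimal sketch of multi-level residual quantization, one common form of vector quantization. It is an illustration under stated assumptions, not the implementation of LC-Rec: the codebooks are random rather than learned, the uniform semantic mapping that resolves conflicting indices is omitted, and the function names and the `<a_3><b_0>` token format are hypothetical.

```python
import random


def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def residual_quantize(vec, codebooks):
    """Map an embedding to one code per level, quantizing the residual each time."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # Subtract the chosen code so the next level refines what is left over.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def to_index_tokens(codes):
    """Render codes as hypothetical out-of-vocabulary tokens, one per level."""
    return "".join(
        f"<{chr(ord('a') + level)}_{code}>" for level, code in enumerate(codes)
    )


def make_codebooks(levels, size, dim, seed=0):
    """Assumption: random codebooks stand in for learned ones."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(size)]
        for _ in range(levels)
    ]


codebooks = make_codebooks(levels=3, size=4, dim=8)
rng = random.Random(1)
item_embedding = [rng.uniform(-1, 1) for _ in range(8)]  # stand-in for an item's text embedding
codes = residual_quantize(item_embedding, codebooks)
print(to_index_tokens(codes))
```

Each item is thus described by a short sequence of tokens that could be added to an LLM's vocabulary, so that items with similar embeddings share index prefixes. Two different items can still receive the same code sequence in this sketch; avoiding such conflicts is the role the abstract attributes to uniform semantic mapping.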
