Generative recommendation has emerged as a promising paradigm for augmenting recommender systems with recent advances in generative artificial intelligence. The task is formulated as sequence-to-sequence generation: the input sequence encodes the items a user has previously interacted with, and the output sequence is the generative identifier (GID) of the recommended item. However, existing generative recommendation approaches still struggle to (i) effectively integrate user-item collaborative signals and item content information within a unified generative framework, and (ii) efficiently align content information with collaborative signals. In this paper, we introduce content-based collaborative generation for recommender systems, denoted ColaRec. To capture collaborative signals, the GIDs are derived from a pretrained collaborative filtering model, while each user is represented by aggregating the content of the items they have interacted with. The aggregated textual descriptions of these items are then fed into a language model to capture content information. This design lets ColaRec combine collaborative signals and content information within an end-to-end framework. For alignment, we propose an item indexing task that facilitates the mapping between the content-based semantic space and the interaction-based collaborative space. In addition, a contrastive loss ensures that items with similar collaborative GIDs have similar content representations, further enhancing alignment. To validate the efficacy of ColaRec, we conduct experiments on three benchmark datasets. The empirical results demonstrate the superior performance of ColaRec.
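The contrastive alignment idea described above can be illustrated with a minimal sketch. The abstract does not specify the form of the loss, the structure of the GIDs, or how positives and negatives are sampled, so everything below is an assumption made for illustration: GIDs are modeled as tuples of integer tokens, "similar" GIDs are taken to be those sharing a longer prefix, and the loss is a simple margin-based triplet hinge on cosine similarity. The names `shared_prefix_len`, `pick_triplet`, and `triplet_contrastive_loss` are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def shared_prefix_len(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of leading tokens two GIDs have in common (assumed similarity measure)."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero content vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def pick_triplet(anchor_id: str, gids: Dict[str, Tuple[int, ...]]) -> Tuple[str, str]:
    """Assumed sampling rule: the positive shares the longest GID prefix
    with the anchor, the negative shares the shortest."""
    others = [item for item in gids if item != anchor_id]

    def overlap(item: str) -> int:
        return shared_prefix_len(gids[anchor_id], gids[item])

    return max(others, key=overlap), min(others, key=overlap)


def triplet_contrastive_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 0.5,
) -> float:
    """Hinge loss: zero once the anchor is more similar to the positive
    than to the negative by at least `margin` (margin value is illustrative)."""
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))


# Toy data (made up): items "a" and "b" share a two-token GID prefix, "c" shares none.
gids = {"a": (1, 2, 0), "b": (1, 2, 1), "c": (3, 0, 0)}
content = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}

pos_id, neg_id = pick_triplet("a", gids)
aligned_loss = triplet_contrastive_loss(content["a"], content[pos_id], content[neg_id])
misaligned_loss = triplet_contrastive_loss(content["a"], content[neg_id], content[pos_id])
```

In this toy setup the loss vanishes when the content vectors already agree with the GID structure and becomes positive when they do not, which is the behavior a contrastive alignment term is meant to encourage. A real system would compute the content representations with the language model and backpropagate through them.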