Representation Learning with Large Language Models for Recommendation

Recommender systems have advanced significantly under the influence of deep learning and graph neural networks, particularly in capturing complex user-item relationships. However, graph-based recommenders depend heavily on ID-based data and may disregard valuable textual information associated with users and items, which results in less informative learned representations. Moreover, implicit feedback data can introduce noise and bias, which makes user preference learning less effective. Although integrating large language models (LLMs) into traditional ID-based recommenders has gained attention, challenges such as limited scalability, the limitations of relying on text alone, and prompt input constraints must be addressed before such methods can be deployed in practical recommender systems. To address these challenges, we propose RLMRec, a model-agnostic framework that enhances existing recommenders with LLM-empowered representation learning. It introduces a recommendation paradigm that integrates representation learning with LLMs to capture intricate semantic aspects of user behaviors and preferences. RLMRec incorporates auxiliary textual signals, develops an LLM-empowered user/item profiling paradigm, and aligns the semantic space of LLMs with the representation space of collaborative relational signals through a cross-view alignment framework. We further establish a theoretical foundation demonstrating that incorporating textual signals through mutual information maximization enhances the quality of representations. In our evaluation, we integrate RLMRec with state-of-the-art recommender models and analyze its efficiency and robustness to noisy data. Our implementation code is available at https://github.com/HKUDS/RLMRec.
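To make the idea of cross-view alignment concrete, the following is a minimal sketch of one common way to align two embedding views: an InfoNCE-style contrastive loss, which is a standard lower-bound surrogate for mutual information. It is an illustration under stated assumptions and not RLMRec's actual objective, which may differ. The function name `info_nce_alignment`, the `temperature` value, and the assumption that the LLM-derived semantic embeddings have already been projected to the same dimension as the collaborative embeddings are all hypothetical choices made for this example.

```python
import numpy as np


def info_nce_alignment(cf_emb, sem_emb, temperature=0.2):
    """InfoNCE-style loss between two views of the same users or items.

    cf_emb:  (n, d) collaborative embeddings from an ID-based recommender.
    sem_emb: (n, d) semantic embeddings, assumed (hypothetically) to be
             already projected into the same d-dimensional space.
    Row i of both matrices is assumed to describe the same user or item.
    """
    # L2-normalise each row so the dot product is a cosine similarity.
    cf = cf_emb / np.linalg.norm(cf_emb, axis=1, keepdims=True)
    sem = sem_emb / np.linalg.norm(sem_emb, axis=1, keepdims=True)

    # Pairwise similarities between every collaborative/semantic pair.
    logits = cf @ sem.T / temperature  # shape (n, n)

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Matching pairs sit on the diagonal; all other rows act as negatives.
    return float(-np.mean(np.diag(log_prob)))
```

Minimizing a loss of this kind pulls each collaborative embedding toward the semantic embedding of the same user or item and pushes it away from the others in the batch, so a lower value indicates better-aligned views.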
