Recommender systems have advanced significantly through deep learning and graph neural networks, particularly in capturing complex user-item relationships. However, these graph-based recommenders rely heavily on ID-based data, potentially disregarding valuable textual information associated with users and items and yielding less informative learned representations. Moreover, the use of implicit feedback data introduces potential noise and bias, posing challenges for effective user preference learning. While integrating large language models (LLMs) into traditional ID-based recommenders has gained attention, challenges such as scalability, reliance on text alone, and prompt input constraints must be addressed before LLMs can be deployed effectively in practical recommender systems. To address these challenges, we propose RLMRec, a model-agnostic framework that enhances existing recommenders with LLM-empowered representation learning. It adopts a recommendation paradigm that integrates representation learning with LLMs to capture intricate semantic aspects of user behaviors and preferences. RLMRec incorporates auxiliary textual signals, develops an LLM-empowered user/item profiling paradigm, and aligns the semantic space of LLMs with the representation space of collaborative relational signals through a cross-view alignment framework. This work further establishes a theoretical foundation showing that incorporating textual signals through mutual information maximization improves representation quality. In our evaluation, we integrate RLMRec with state-of-the-art recommender models and analyze its efficiency and robustness to noisy data. Our implementation code is available at https://github.com/HKUDS/RLMRec.
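The cross-view alignment described above can be illustrated with a contrastive (InfoNCE-style) objective, which is a standard lower bound on the mutual information between two views. The sketch below is an assumption for illustration only: the function name, the use of cosine similarity, and the temperature value are not taken from the paper, and RLMRec's exact objective may differ.

```python
import numpy as np

def info_nce_alignment(collab_emb, semantic_emb, temperature=0.2):
    """InfoNCE-style alignment loss between two views of the same users/items.

    Minimizing this loss maximizes a lower bound on the mutual information
    between the collaborative-filtering view and the LLM semantic view.
    Hypothetical sketch; not the paper's exact formulation.
    """
    # L2-normalize both views so the dot product is cosine similarity.
    c = collab_emb / np.linalg.norm(collab_emb, axis=1, keepdims=True)
    s = semantic_emb / np.linalg.norm(semantic_emb, axis=1, keepdims=True)
    logits = (c @ s.T) / temperature                 # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives lie on the diagonal: entity i's two views should match,
    # while all other entities in the batch serve as negatives.
    return -np.mean(np.diag(log_probs))
```

Intuitively, the loss is low when each entity's collaborative embedding is closer to its own LLM-derived semantic embedding than to those of other entities, which is the sense in which the two representation spaces become aligned.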