Representation Learning with Large Language Models for Recommendation

Recommender systems have advanced significantly under the influence of deep learning and graph neural networks, particularly in capturing complex user-item relationships. However, these graph-based recommenders depend heavily on ID-based data and may disregard valuable textual information associated with users and items, resulting in less informative learned representations. Moreover, the use of implicit feedback data introduces potential noise and bias, which makes user preference learning less effective. While the integration of large language models (LLMs) into traditional ID-based recommenders has gained attention, challenges such as scalability, the limitations of relying on text alone, and prompt input constraints must be addressed before such integration is effective in practical recommender systems. To address these challenges, we propose RLMRec, a model-agnostic framework that enhances existing recommenders with LLM-empowered representation learning. It proposes a recommendation paradigm that integrates representation learning with LLMs to capture intricate semantic aspects of user behaviors and preferences. RLMRec incorporates auxiliary textual signals, develops a user/item profiling paradigm empowered by LLMs, and aligns the semantic space of LLMs with the representation space of collaborative relational signals through a cross-view alignment framework. We further establish a theoretical foundation demonstrating that incorporating textual signals through mutual information maximization enhances the quality of representations. In our evaluation, we integrate RLMRec with state-of-the-art recommender models and analyze its efficiency and its robustness to noisy data. Our implementation is available at https://github.com/HKUDS/RLMRec.
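
To make the idea of cross-view alignment through mutual information maximization concrete, the following is a minimal sketch rather than the RLMRec implementation. It assumes an InfoNCE-style contrastive loss, a common lower-bound estimator of mutual information; the abstract does not state which objective is used. The function name `info_nce_alignment`, the `temperature` value, and the assumption that the LLM-derived semantic embeddings have already been projected to the same dimension as the collaborative embeddings are all hypothetical choices made for illustration.

```python
import numpy as np


def info_nce_alignment(cf_emb, sem_emb, temperature=0.2):
    """InfoNCE loss between two views of the same users/items.

    cf_emb:  (n, d) collaborative (ID-based) embeddings.
    sem_emb: (n, d) semantic embeddings derived from LLM profiles,
             assumed (hypothetically) to be projected to dimension d.
    Row i of both matrices must describe the same user or item.
    """
    # L2-normalize so that dot products are cosine similarities.
    cf = cf_emb / np.linalg.norm(cf_emb, axis=1, keepdims=True)
    sem = sem_emb / np.linalg.norm(sem_emb, axis=1, keepdims=True)

    # Pairwise similarities between every collaborative/semantic pair.
    logits = cf @ sem.T / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Matching pairs sit on the diagonal; all other entries act as negatives.
    return float(-np.mean(np.diag(log_prob)))


# Toy comparison on random data: correctly paired views versus mismatched rows.
rng = np.random.default_rng(0)
views = rng.normal(size=(8, 16))
aligned_loss = info_nce_alignment(views, views)
mismatched_loss = info_nce_alignment(views, views[::-1])
print(aligned_loss, mismatched_loss)
```

Minimizing a loss of this kind pulls each collaborative embedding toward the semantic embedding of the same user or item and pushes it away from the others in the batch. In a real recommender it would typically be added, with a weighting coefficient, to the base model's recommendation loss.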
