Data sparsity has long been a challenge in recommender systems, and previous studies have attempted to address it by incorporating side information. However, side information often brings problems of its own, such as noise, limited availability, and low data quality, which hinder accurate modeling of user preferences and degrade recommendation performance. In light of recent advances in large language models (LLMs), which possess extensive knowledge bases and strong reasoning capabilities, we propose LLMRec, a novel framework that enhances recommender systems through three simple yet effective LLM-based graph augmentation strategies. Our approach leverages the rich content available on online platforms (e.g., Netflix, MovieLens) to augment the interaction graph from a natural language perspective in three ways: (i) reinforcing user-item interaction edges, (ii) enhancing the understanding of item node attributes, and (iii) conducting user node profiling. These strategies address the challenges posed by sparse implicit feedback and low-quality side information in recommenders. In addition, to ensure the quality of the augmentation, we develop a denoised data robustification mechanism that combines noisy implicit feedback pruning with MAE-based feature enhancement to refine the augmented data and improve its reliability. Furthermore, we provide theoretical analysis to support the effectiveness of LLMRec and clarify how our method facilitates model optimization. Experimental results on benchmark datasets demonstrate the superiority of our LLM-based augmentation approach over state-of-the-art techniques. To ensure reproducibility, we have made our code and augmented data publicly available at: https://github.com/HKUDS/LLMRec.git
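As a rough illustration of strategy (i) and of noisy implicit feedback pruning, the following is a minimal sketch and not the code from the LLMRec repository. It assumes that an LLM is asked to pick one liked and one disliked item from a candidate list for each user, and that pruning keeps the lowest-loss fraction of training samples. The names `augment_edges`, `prune_noisy`, `choose`, and `keep_ratio` are hypothetical, and `fake_llm` is a deterministic stand-in for a real LLM call.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical representation: an edge is a (user_id, item_id) pair.
Edge = Tuple[int, int]


def augment_edges(
    interactions: Set[Edge],
    candidates: Dict[int, List[int]],
    choose: Callable[[int, List[int]], Tuple[int, int]],
) -> Tuple[Set[Edge], Set[Edge]]:
    """Return (positive_edges, negative_edges) after augmentation.

    `choose` stands in for an LLM call: given a user and a candidate list,
    it returns (liked_item, disliked_item). It is a placeholder, not an
    API taken from the LLMRec repository.
    """
    positives = set(interactions)
    negatives: Set[Edge] = set()
    for user, items in candidates.items():
        # Only offer items the user has not interacted with yet.
        unseen = [i for i in items if (user, i) not in interactions]
        if len(unseen) < 2:
            continue  # need room for one liked and one disliked pick
        liked, disliked = choose(user, unseen)
        if liked == disliked or liked not in unseen or disliked not in unseen:
            continue  # discard malformed picks rather than trust them
        positives.add((user, liked))
        negatives.add((user, disliked))
    return positives, negatives


def prune_noisy(samples: List[Tuple[Edge, float]], keep_ratio: float) -> List[Edge]:
    """Keep the lowest-loss fraction of samples (an assumed pruning heuristic)."""
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    ranked = sorted(samples, key=lambda s: s[1])  # ascending by loss
    keep = max(1, int(len(ranked) * keep_ratio)) if ranked else 0
    return [edge for edge, _ in ranked[:keep]]


def fake_llm(user: int, items: List[int]) -> Tuple[int, int]:
    """Deterministic stand-in: like the first candidate, dislike the last."""
    return items[0], items[-1]


observed = {(0, 10), (1, 11)}
pos, neg = augment_edges(observed, {0: [10, 12, 13], 1: [11, 14]}, fake_llm)
kept = prune_noisy([((0, 12), 0.2), ((0, 10), 0.9), ((1, 11), 0.1)], keep_ratio=0.5)
```

In this toy run, user 0 gains one augmented positive and one augmented negative edge, while user 1 is skipped because only one unseen candidate remains. Filtering out malformed picks is a design choice of the sketch: LLM output cannot be assumed to stay within the candidate list, so invalid selections are dropped instead of added to the graph.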