Data sparsity has long been a challenge in recommendation systems, and previous studies have attempted to address it by incorporating side information. However, this approach often introduces problems of its own, such as noise, limited availability, and low data quality, which hinder the accurate modeling of user preferences and degrade recommendation performance. Motivated by recent advances in large language models (LLMs), which possess extensive knowledge bases and strong reasoning capabilities, we propose LLMRec, a novel framework that enhances recommender systems through three simple yet effective LLM-based graph augmentation strategies. Our approach leverages the rich content available on online platforms (e.g., Netflix, MovieLens) to augment the interaction graph in three ways, each approached from a natural language perspective: (i) reinforcing user-item interaction edges, (ii) enhancing the understanding of item node attributes, and (iii) conducting user node profiling. With these strategies, we address the challenges posed by sparse implicit feedback and low-quality side information in recommenders. In addition, to ensure the quality of the augmentation, we develop a denoised data robustification mechanism that combines noisy implicit feedback pruning with MAE-based feature enhancement to refine the augmented data and improve its reliability. Furthermore, we provide a theoretical analysis that supports the effectiveness of LLMRec and clarifies how our method facilitates model optimization. Experimental results on benchmark datasets demonstrate the superiority of our LLM-based augmentation approach over state-of-the-art techniques. To ensure reproducibility, we have made our code and augmented data publicly available at: https://github.com/HKUDS/LLMRec.git
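The abstract names edge augmentation and noisy implicit feedback pruning without giving their mechanics, so the following is only a minimal sketch of one plausible reading, not the released implementation. It assumes that an LLM picks one unseen item per user from a candidate list, and that pruning keeps the fraction of edges with the smallest training loss. The names `augment_edges`, `prune_noisy`, `pick_item`, and `keep_ratio` are hypothetical and do not come from the paper or its repository.

```python
from typing import Callable, Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (user_id, item_id)


def augment_edges(
    observed: Set[Edge],
    candidates: Dict[str, List[str]],
    pick_item: Callable[[str, List[str], List[str]], str],
) -> Set[Edge]:
    """Propose at most one new edge per user (hypothetical helper).

    `pick_item` stands in for an LLM call: given a user, that user's
    interaction history, and a list of unseen candidates, it returns the
    candidate the user is predicted to like.
    """
    proposed: Set[Edge] = set()
    for user, pool in candidates.items():
        history = sorted(item for u, item in observed if u == user)
        unseen = [item for item in pool if (user, item) not in observed]
        if not unseen:
            continue
        choice = pick_item(user, history, unseen)
        if choice in unseen:  # ignore answers outside the candidate list
            proposed.add((user, choice))
    return proposed


def prune_noisy(edge_loss: Dict[Edge, float], keep_ratio: float) -> Set[Edge]:
    """Keep the `keep_ratio` fraction of edges with the smallest loss.

    Assumption: a large per-edge loss marks an interaction as likely noise.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    ranked = sorted(edge_loss, key=edge_loss.get)  # ascending by loss
    n_keep = max(1, int(len(ranked) * keep_ratio)) if ranked else 0
    return set(ranked[:n_keep])
```

In this sketch the candidate list bounds what the LLM may return, so an answer outside the list is dropped rather than trusted. The per-edge losses passed to `prune_noisy` would come from whatever recommender is being trained; they are left as an input here because the abstract does not name the loss.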