Data sparsity has long been a challenge in recommender systems, and previous studies have attempted to address it by incorporating side information. However, side information often brings problems of its own, such as noise, limited availability, and low data quality, which hinder accurate modeling of user preferences and degrade recommendation performance. Motivated by recent advances in large language models (LLMs), which possess extensive knowledge bases and strong reasoning capabilities, we propose LLMRec, a novel framework that enhances recommender systems through three simple yet effective LLM-based graph augmentation strategies. Our approach leverages the rich content available on online platforms (e.g., Netflix, MovieLens) to augment the interaction graph in three ways: (i) reinforcing user-item interaction edges, (ii) enhancing the understanding of item node attributes, and (iii) conducting user node profiling, all from a natural language perspective. With these strategies, we address the challenges posed by sparse implicit feedback and low-quality side information in recommenders. In addition, to ensure the quality of the augmentation, we develop a denoised data robustification mechanism that combines noisy implicit feedback pruning with MAE-based feature enhancement to refine the augmented data and improve its reliability. Furthermore, we provide theoretical analysis to support the effectiveness of LLMRec and clarify how our method facilitates model optimization. Experimental results on benchmark datasets demonstrate the superiority of our LLM-based augmentation approach over state-of-the-art techniques. To ensure reproducibility, we have made our code and augmented data publicly available at: https://github.com/HKUDS/LLMRec.git
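As a rough illustration of two of the ideas named in the abstract (reinforcing user-item interaction edges, and pruning noisy implicit feedback), the following minimal Python sketch may help. It is not the LLMRec implementation: the function names `augment_edges` and `prune_noisy`, the `pick_item` callback (a stand-in for an LLM call that selects an item from a candidate pool), the per-edge confidence scores, and the `keep_ratio` parameter are all assumptions introduced here for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (user_id, item_id)


def augment_edges(
    interactions: Dict[str, Set[str]],
    candidates: Dict[str, List[str]],
    pick_item: Callable[[str, List[str]], str],
) -> Set[Edge]:
    """Propose one new edge per user from that user's candidate pool.

    `pick_item` is a hypothetical stand-in for an LLM call; it receives the
    user id and the candidate items the user has not interacted with yet.
    """
    augmented: Set[Edge] = set()
    for user, pool in candidates.items():
        seen = interactions.get(user, set())
        unseen = [item for item in pool if item not in seen]
        if not unseen:
            continue  # nothing new to suggest for this user
        augmented.add((user, pick_item(user, unseen)))
    return augmented


def prune_noisy(scores: Dict[Edge, float], keep_ratio: float) -> List[Edge]:
    """Keep only the highest-scoring fraction of augmented edges.

    `scores` maps each edge to an assumed confidence value; edges that fall
    outside the top `keep_ratio` fraction are treated as noise and dropped.
    """
    ranked = sorted(scores, key=lambda edge: scores[edge], reverse=True)
    keep = int(len(ranked) * keep_ratio)
    return ranked[:keep]
```

In practice `pick_item` would wrap a prompt built from the user's history and the item text, and the scores would come from the recommender being trained. Both are left abstract here because the abstract does not specify them.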