Recommendation systems often suffer from data sparsity caused by limited user-item interactions, which degrades their performance and amplifies popularity bias in real-world scenarios. This paper proposes a novel data augmentation framework that leverages Large Language Models (LLMs) and item textual descriptions to enrich interaction data. By few-shot prompting an LLM multiple times to rerank items and aggregating the results via majority voting, we generate high-confidence synthetic user-item interactions, supported by theoretical guarantees based on the concentration of measure. To use the augmented data effectively in a graph recommendation system, we integrate it into a graph contrastive learning framework, which mitigates distributional shift and alleviates popularity bias. Extensive experiments show that our method improves accuracy and reduces popularity bias, outperforming strong baselines.
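The voting step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `majority_vote` and `hoeffding_error_bound`, the `top_k` and `min_share` parameters, and the choice of Hoeffding's inequality as the concentration bound are all assumptions made for illustration, and the LLM call is replaced by precomputed rankings. The bound also assumes that the repeated prompts behave like independent votes, each correct with probability above 0.5.

```python
import math
from collections import Counter


def majority_vote(rankings, top_k=1, min_share=0.5):
    """Return items placed in the top_k by more than min_share of the rankings.

    `rankings` is a list of item lists, one per (hypothetical) LLM reranking call.
    """
    counts = Counter()
    for ranking in rankings:
        # Count each item at most once per ranking.
        counts.update(set(ranking[:top_k]))
    n = len(rankings)
    return sorted(item for item, c in counts.items() if c / n > min_share)


def hoeffding_error_bound(n_votes, p_correct):
    """Hoeffding upper bound on the chance that a majority of votes is wrong.

    Assumes n_votes independent votes, each correct with probability
    p_correct > 0.5 (an illustrative assumption, not taken from the paper).
    """
    if p_correct <= 0.5:
        raise ValueError("p_correct must exceed 0.5 for the bound to apply")
    return math.exp(-2 * n_votes * (p_correct - 0.5) ** 2)


# Three simulated rerankings of the same candidate items for one user.
rankings = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
synthetic_items = majority_vote(rankings, top_k=1)
error_bound = hoeffding_error_bound(n_votes=50, p_correct=0.7)
```

Under these assumptions the bound decays exponentially in the number of prompts, which is the sense in which repeated prompting plus voting yields high-confidence interactions.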