GraphRAG-IRL: Personalized Recommendation with Graph-Grounded Inverse Reinforcement Learning and LLM Re-ranking

Personalized recommendation requires models that capture sequential user preferences while remaining robust to sparse feedback and semantic ambiguity. Recent work has explored large language models (LLMs) as recommenders and re-rankers, but pure prompt-based ranking often suffers from poor calibration, sensitivity to candidate ordering, and popularity bias. These limitations make LLMs useful semantic reasoners, but unreliable as standalone ranking engines. We present GraphRAG-IRL, a hybrid recommendation framework that combines graph-grounded feature construction, inverse reinforcement learning (IRL), and persona-guided LLM re-ranking. Our method constructs a heterogeneous knowledge graph over items, categories, and concepts, retrieves both individual and community preference context, and uses these signals to train a Maximum Entropy IRL model for calibrated pre-ranking. An LLM is then applied only to a short candidate list, where persona-guided prompts provide complementary semantic judgments that are fused with IRL rankings. Experiments show that GraphRAG-IRL is a strong standalone recommender: IRL-MLP with GraphRAG improves NDCG@10 by 15.7% on MovieLens and 16.6% on KuaiRand over supervised baselines. The results also show that IRL and GraphRAG are superadditive, with the combined gain exceeding the sum of their individual improvements. Persona-guided LLM fusion further improves ranking quality, yielding up to 16.8% NDCG@10 improvement over the IRL-only baseline on MovieLens ml-1m, while score fusion on KuaiRand provides consistent gains of 4-6% across LLM providers.
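The abstract reports results in NDCG@10 and mentions fusing LLM judgments with IRL rankings, but it does not give the fusion formula. The sketch below is therefore only illustrative: it computes the standard NDCG@k metric and shows one plausible fusion scheme, a weighted sum of min-max-normalized scores. The function names, the weight alpha, and the normalization choice are assumptions made here for illustration, not the authors' method.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k ranked positions."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k, with the ideal ordering taken over the same candidate list."""
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0


def min_max(scores):
    """Rescale a {item: score} dict to [0, 1]; constant scores map to 0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 0.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}


def fuse_scores(irl_scores, llm_scores, alpha=0.7):
    """Hypothetical fusion: alpha * IRL + (1 - alpha) * LLM on normalized scores.

    Items the LLM did not score receive an LLM contribution of 0 (an assumption).
    Returns item ids sorted from best to worst fused score.
    """
    irl_n, llm_n = min_max(irl_scores), min_max(llm_scores)
    fused = {
        item: alpha * irl_n[item] + (1 - alpha) * llm_n.get(item, 0.0)
        for item in irl_n
    }
    return sorted(fused, key=fused.get, reverse=True)


if __name__ == "__main__":
    irl = {"a": 2.0, "b": 1.0, "c": 0.0}   # pre-ranking scores (made-up values)
    llm = {"a": 0.0, "b": 1.0, "c": 0.2}   # LLM judgments (made-up values)
    ranking = fuse_scores(irl, llm, alpha=0.5)
    relevant = {"b"}                        # held-out ground truth (made-up)
    print(ranking, ndcg_at_k([1 if i in relevant else 0 for i in ranking]))
```

In a setup like this, alpha would typically be tuned on validation data; setting alpha to 1.0 recovers the IRL-only ranking, which gives a natural baseline for measuring what the LLM signal adds.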
