Vectorized Context-Aware Embeddings for GAT-Based Collaborative Filtering

Recommender systems often struggle with data sparsity and cold-start scenarios, which limits their ability to provide accurate suggestions for new or infrequent users. This paper presents a Graph Attention Network (GAT) based Collaborative Filtering (CF) framework enhanced with context-aware embeddings driven by a Large Language Model (LLM). Specifically, we generate concise textual user profiles and unify item metadata (titles, genres, overviews) into rich textual embeddings, and inject both as initial node features in a bipartite user-item graph. To further optimize ranking performance, we introduce a hybrid loss function that combines Bayesian Personalized Ranking (BPR) with a cosine similarity term and robust negative sampling, so that explicit negative feedback is distinguished from unobserved data. Experiments on the MovieLens 100k and 1M datasets show consistent improvements over state-of-the-art baselines in Precision, NDCG, and MAP, and demonstrate robustness for users with limited interaction history. Ablation studies confirm the critical role of the LLM-augmented embeddings and the cosine similarity term in capturing nuanced semantic relationships. Our approach effectively mitigates sparsity and cold-start limitations by integrating LLM-derived contextual understanding into graph-based architectures. Future directions include balancing recommendation accuracy with coverage and diversity, and introducing fairness-aware constraints and interpretability features to further enhance system performance.
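
The abstract names the ingredients of the hybrid loss (a BPR term, a cosine similarity term, and negative sampling that separates explicit negatives from unobserved items) but not their exact form. The following is a minimal sketch under stated assumptions, not the authors' implementation: the cosine term is assumed to be `1 - cos(user, positive_item)`, and the weight `cos_weight`, the probability `explicit_prob`, and all function names are hypothetical.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm > 0 else 0.0


def hybrid_loss(user, pos_item, neg_item, cos_weight=0.1):
    """BPR loss plus an assumed cosine alignment term for one triple."""
    # BPR: -log(sigmoid(score_pos - score_neg)), with dot-product scores.
    margin = dot(user, pos_item) - dot(user, neg_item)
    # Numerically stable softplus(-margin), which equals -log(sigmoid(margin)).
    bpr = max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    # Assumed form of the cosine term: pull the user toward the positive item.
    cos_term = 1.0 - cosine(user, pos_item)
    return bpr + cos_weight * cos_term


def sample_negative(explicit_negatives, all_items, interacted,
                    explicit_prob=0.5, rng=random):
    """Pick a negative item, preferring explicit negatives with some probability.

    explicit_negatives: items the user rated poorly (explicit negative feedback).
    interacted: every item the user has interacted with, positive or negative.
    """
    if explicit_negatives and rng.random() < explicit_prob:
        return rng.choice(sorted(explicit_negatives))
    # Otherwise fall back to an item the user has never interacted with.
    unobserved = sorted(set(all_items) - set(interacted))
    return rng.choice(unobserved)
```

In a full pipeline the vectors passed to `hybrid_loss` would be the GAT output embeddings for the user and item nodes; plain Python lists are used here only to keep the sketch dependency-free.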

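Related content

Collaborative filtering

Collaborative filtering, put simply, uses the preferences of a group with similar interests and shared experience to recommend information a user is likely to find interesting. Individuals respond to information through a cooperative mechanism (for example, by rating it), and these responses are recorded to filter content and help others sift through information. Responses need not be limited to items of particular interest; records of items a user particularly dislikes are just as important. Collaborative filtering can be further divided into rating-based filtering and social filtering. It later became an important part of e-commerce: the system recommends items a customer "may like" based on that customer's past purchases and on the purchases of customers with similar buying behavior, that is, it uses the preferences of a community to provide personalized recommendations of information, products, and so on. Beyond recommendation, mathematical methods have been developed in recent years that let the system automatically compute the strength of preferences and filter out weak signals, so that the filtered content is better grounded. The results may not be perfectly accurate, but adding graded preference strength has broadened the concept's applicability; besides e-commerce, applications include information retrieval, personal online media libraries, and personal bookshelves.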
