The successful integration of graph neural networks into recommender systems (RSs) has led to a novel paradigm in collaborative filtering (CF): graph collaborative filtering (graph CF). By representing user-item data as an undirected bipartite graph, graph CF exploits short- and long-range connections to extract collaborative signals that capture user preferences more accurately than traditional CF methods. Although the recent literature highlights the efficacy of various algorithmic strategies in graph CF, the impact of datasets and their topological features on recommendation performance has yet to be studied. To fill this gap, we propose a topology-aware analysis of graph CF. In this study, we (i) take several widely adopted recommendation datasets and use them to generate a large set of synthetic sub-datasets through two state-of-the-art graph sampling methods, (ii) measure eleven of their classical and topological characteristics, and (iii) estimate the accuracy achieved on the generated sub-datasets by four popular and recent graph-based RSs (i.e., LightGCN, DGCF, UltraGCN, and SVD-GCN). Finally, the investigation presents an explanatory framework that reveals the linear relationships between these characteristics and the accuracy measures. The results, statistically validated under different graph sampling settings, confirm the existence of solid dependencies between topological characteristics and accuracy in graph-based recommendation, offering a new perspective on how to interpret graph CF.