How Expressive are Graph Neural Networks in Recommendation?

Graph Neural Networks (GNNs) have demonstrated superior performance on various graph learning tasks, including recommendation, where they leverage user-item collaborative filtering signals in graphs. However, despite their empirical effectiveness in state-of-the-art recommender models, theoretical formulations of their capability are scarce. Recent research has explored the expressiveness of GNNs in general, demonstrating that message-passing GNNs are at most as powerful as the Weisfeiler-Lehman test, and that GNNs combined with random node initialization are universal. Nevertheless, the concept of "expressiveness" for GNNs remains vaguely defined. Most existing works adopt the graph isomorphism test as the metric of expressiveness, but this graph-level task may not effectively assess a model's ability in recommendation, where the objective is to distinguish nodes of differing closeness. In this paper, we provide a comprehensive theoretical analysis of the expressiveness of GNNs in recommendation, considering three levels of expressiveness metrics: graph isomorphism (graph-level), node automorphism (node-level), and topological closeness (link-level). We propose the topological closeness metric to evaluate GNNs' ability to capture the structural distance between nodes, which aligns closely with the objective of recommendation. To validate the effectiveness of this new metric in evaluating recommendation performance, we introduce a learning-less GNN algorithm that is optimal on the new metric and can be made optimal on the node-level metric with suitable modification. We conduct extensive experiments comparing the proposed algorithm against various types of state-of-the-art GNN models to explore the explainability of the new metric in the recommendation task. For reproducibility, the implementation code is available at https://github.com/HKUDS/GTE.
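To make the graph-level baseline mentioned above concrete, the following is a minimal sketch of 1-dimensional Weisfeiler-Lehman color refinement, the test that bounds the power of message-passing GNNs. It is an illustration only, not code from the GTE repository; the function name `wl_color_histogram`, the adjacency-dict graph format, and the toy graphs are assumptions made for this example.

```python
from collections import Counter


def wl_color_histogram(adjacency, rounds=3):
    """Run 1-WL color refinement and return the histogram of final colors.

    `adjacency` maps each node to a list of its neighbors (hypothetical format).
    Colors are kept as nested tuples so that they are directly comparable
    across different graphs without a shared relabeling step.
    """
    # Every node starts with the same color (no node features).
    colors = {node: 0 for node in adjacency}
    for _ in range(rounds):
        # New color = (own color, sorted multiset of neighbor colors).
        colors = {
            node: (colors[node], tuple(sorted(colors[n] for n in adjacency[node])))
            for node in adjacency
        }
    return Counter(colors.values())


# Two disjoint triangles and a 6-cycle: non-isomorphic, but both 2-regular.
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
# A 6-node path, whose endpoints have degree 1.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

same = wl_color_histogram(two_triangles) == wl_color_histogram(hexagon)
different = wl_color_histogram(path) != wl_color_histogram(hexagon)
print(same, different)
```

In this sketch the two triangles and the 6-cycle receive identical color histograms, so the test cannot tell them apart, while the path is separated from the cycle after one round. Within each of the two regular graphs every node also ends up with the same color, which illustrates why a graph-level test of this kind says little about how close two particular nodes are.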
