Rethinking Node-wise Propagation for Large-scale Graph Learning

Scalable graph neural networks (GNNs) have emerged as a promising technique, offering strong predictive performance and high running efficiency across numerous large-scale graph-based web applications. However, (i) most scalable GNNs apply the same propagation rules to all nodes in a graph, neglecting their topological uniqueness; and (ii) existing node-wise propagation optimization strategies are insufficient on web-scale graphs with intricate topology, where a full portrayal of nodes' local properties is required. Intuitively, different nodes in web-scale graphs possess distinct topological roles, so propagating them indiscriminately or neglecting their local contexts may compromise the quality of node representations. Small-scale scenarios do not exhibit such intricate topology. To address these issues, we propose \textbf{A}daptive \textbf{T}opology-aware \textbf{P}ropagation (ATP), which reduces potential high-bias propagation and extracts the structural patterns of each node in a scalable manner to improve running efficiency and predictive performance. Remarkably, ATP is designed from a new perspective as a plug-and-play node-wise propagation optimization strategy that can be executed offline, independently of the graph learning process. It can therefore be seamlessly integrated into most scalable GNNs while remaining orthogonal to existing node-wise propagation optimization strategies. Extensive experiments on 12 datasets, including the most representative large-scale dataset ogbn-papers100M, demonstrate the effectiveness of ATP. Specifically, ATP efficiently improves the performance of prevalent scalable GNNs on semi-supervised node classification while reducing redundant computational costs.
