Improving Graph Neural Networks on Multi-node Tasks with Labeling Tricks

In this paper, we provide a theory of using graph neural networks (GNNs) for \textit{multi-node representation learning}, where the goal is to learn a representation for a set of more than one node, such as a link. Existing GNNs are mainly designed to learn single-node representations. To learn a representation for a node set, a common practice in previous work is to directly aggregate the single-node representations obtained by a GNN. We show a fundamental limitation of this approach, namely its inability to capture the dependence among the nodes in a node set, and argue that directly aggregating individual node representations fails to produce an effective joint representation for multiple nodes. A straightforward solution is to distinguish target nodes from others. Formalizing this idea, we propose the \textit{labeling trick}, which first labels nodes in the graph according to their relationships with the target node set, then applies a GNN to the labeled graph, and finally aggregates the resulting node representations into a multi-node representation. The labeling trick also unifies several previously successful methods for multi-node representation learning, including SEAL, Distance Encoding, ID-GNN, and NBFNet. Beyond node sets in graphs, we also extend labeling tricks to posets, subsets, and hypergraphs. Experiments verify that the labeling trick can boost GNNs on various tasks, including undirected link prediction, directed link prediction, hyperedge prediction, and subgraph prediction. Our work explains the superior performance of previous node-labeling-based methods and establishes a theoretical foundation for using GNNs for multi-node representation learning.
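The limitation and the fix described above can be illustrated with a small sketch. It is not the paper's implementation: it assumes a six-node cycle as the graph, a parameter-free toy message-passing network in place of a trained GNN, sum pooling as the aggregator, and a simple 0/1 indicator as the labeling (one possible choice, used here only for illustration). All function names are hypothetical.

```python
import numpy as np

def cycle_adjacency(n):
    """Adjacency matrix of an undirected cycle on n nodes (assumed toy graph)."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = 1.0
        A[(i + 1) % n, i] = 1.0
    return A

def toy_gnn(A, X, num_layers=2):
    """Toy message passing without learned weights:
    h <- tanh(0.5 * (h + sum of neighbor h))."""
    H = X
    for _ in range(num_layers):
        H = np.tanh(0.5 * (H + A @ H))
    return H

def plain_pair_repr(A, targets):
    """Run the GNN on the unlabeled graph, then sum the target nodes' embeddings."""
    n = A.shape[0]
    # Constant feature plus an all-zero label channel: no node is marked.
    X = np.column_stack([np.ones(n), np.zeros(n)])
    H = toy_gnn(A, X)
    return H[list(targets)].sum(axis=0)

def labeled_pair_repr(A, targets):
    """Mark the target nodes with label 1 (others 0) before running the GNN,
    then sum the target nodes' embeddings from the labeled graph."""
    n = A.shape[0]
    labels = np.zeros(n)
    labels[list(targets)] = 1.0
    X = np.column_stack([np.ones(n), labels])
    H = toy_gnn(A, X)
    return H[list(targets)].sum(axis=0)

A = cycle_adjacency(6)
adjacent, opposite = (0, 1), (0, 3)  # a neighboring pair vs. a distance-3 pair

# Every node of the cycle looks identical to the plain GNN, so the two
# pairs receive the same aggregated representation.
print(np.allclose(plain_pair_repr(A, adjacent), plain_pair_repr(A, opposite)))

# After labeling, the target nodes see different labeled neighborhoods,
# so the two pairs are separated.
print(np.allclose(labeled_pair_repr(A, adjacent), labeled_pair_repr(A, opposite)))
```

In this toy setting the first check prints `True` and the second prints `False`: aggregating single-node embeddings cannot tell an adjacent pair from a distant one, while labeling the target set first makes the node embeddings depend on the other target node.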
