Despite its outstanding performance on various graph tasks, the vanilla Message Passing Neural Network (MPNN) usually fails in link prediction, because it uses only the representations of the two individual target nodes and ignores the pairwise relation between them. To capture pairwise relations, some models add manual features to the input graph and use the output of the MPNN to produce pairwise representations. Others instead use manual features directly as pairwise representations. Although this simplification avoids applying a GNN to each link individually and thus improves scalability, these models still leave considerable room for improvement because their pairwise features are hand-crafted and not learnable. To improve performance while maintaining scalability, we propose Neural Common Neighbor (NCN), which uses learnable pairwise representations. To further boost NCN, we study the unobserved link problem. Graph incompleteness is ubiquitous and leads to distribution shift between the training and test sets, loss of common neighbor information, and degraded model performance. We therefore propose two intervention methods: common neighbor completion and target link removal. Combining these two methods with NCN, we propose Neural Common Neighbor with Completion (NCNC). NCN and NCNC outperform recent strong baselines by large margins, and NCNC achieves state-of-the-art performance in link prediction.
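The abstract does not give the exact form of the learnable pairwise representation. As a minimal sketch, assume it concatenates the element-wise product of the two target-node embeddings with the sum of the embeddings of their common neighbors. The helper names (`common_neighbors`, `pairwise_representation`), the toy graph, and the fixed embeddings standing in for MPNN output are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

Graph = Dict[int, Set[int]]   # adjacency sets of an undirected graph
Embedding = List[float]       # stand-in for an MPNN node embedding


def common_neighbors(adj: Graph, i: int, j: int) -> Set[int]:
    """Return the nodes adjacent to both i and j."""
    return adj[i] & adj[j]


def pairwise_representation(
    adj: Graph, h: Dict[int, Embedding], i: int, j: int
) -> Embedding:
    """Assumed pairwise feature for the candidate link (i, j).

    First half: element-wise product of the two target-node embeddings.
    Second half: sum of embeddings over the common neighbors of i and j
    (a zero vector when they share no neighbor).
    """
    dim = len(h[i])
    target_part = [a * b for a, b in zip(h[i], h[j])]
    cn_part = [0.0] * dim
    for u in common_neighbors(adj, i, j):
        for k in range(dim):
            cn_part[k] += h[u][k]
    return target_part + cn_part


# Toy graph: nodes 0 and 1 share the neighbors 2 and 3.
adj = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1}}
# Fixed vectors used here in place of learned MPNN embeddings.
h = {0: [1.0, 2.0], 1: [0.5, -1.0], 2: [0.25, 0.5], 3: [1.0, 1.0]}

z = pairwise_representation(adj, h, 0, 1)
print(z)
```

Unlike a hand-crafted common-neighbor count, which here would only be the number 2, the second half of `z` depends on the neighbor embeddings themselves, so it can be trained end to end when those embeddings come from an MPNN. In a full model a learnable scoring function would map `z` to a link probability. That part is omitted here.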