Link prediction on graphs is a fundamental problem. Subgraph representation learning approaches (SGRLs) transform link prediction into graph classification on the subgraphs around the links, and have achieved state-of-the-art link prediction performance. However, SGRLs are computationally expensive and do not scale to large graphs because of their costly subgraph-level operations. To unlock the scalability of SGRLs, we propose a new class of SGRLs, which we call Scalable Simplified SGRL (S3GRL). Aimed at faster training and inference, S3GRL simplifies the message passing and aggregation operations in each link's subgraph. As a scalability framework, S3GRL accommodates various subgraph sampling strategies and diffusion operators to emulate computationally expensive SGRLs. We propose multiple instances of S3GRL and empirically study them on small to large-scale graphs. Our extensive experiments demonstrate that the proposed S3GRL models scale up SGRLs without significant performance compromise (and even with considerable gains in some cases), while offering substantially lower computational footprints (e.g., multi-fold speedups in inference and training).
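To make the idea of simplified message passing on a link's subgraph concrete, the following is a minimal sketch and not the authors' implementation. It assumes a k-hop subgraph sampler, a symmetrically normalized adjacency matrix as the diffusion operator, and an element-wise product of the two endpoint representations as the pooling step. The function names `k_hop_subgraph`, `normalized_adjacency`, and `link_features`, and the parameters `k` and `r`, are hypothetical and chosen only for illustration.

```python
import numpy as np

def k_hop_subgraph(adj, u, v, k):
    """Return the sorted node indices within k hops of u or v."""
    visited = {u, v}
    frontier = {u, v}
    for _ in range(k):
        reached = set()
        for node in frontier:
            reached.update(np.nonzero(adj[node])[0].tolist())
        frontier = reached - visited
        visited |= frontier
    return sorted(visited)

def normalized_adjacency(adj):
    """Assumed diffusion operator: D^{-1/2} (A + I) D^{-1/2}."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def link_features(adj, x, u, v, k=1, r=2):
    """Weight-free features for the candidate link (u, v).

    Diffuses node features over the link's subgraph r times and, after
    each step, pools the two endpoint rows by element-wise product.
    """
    nodes = k_hop_subgraph(adj, u, v, k)
    sub_adj = adj[np.ix_(nodes, nodes)].copy()
    iu, iv = nodes.index(u), nodes.index(v)
    # Hide the target link so the features do not leak the label.
    sub_adj[iu, iv] = sub_adj[iv, iu] = 0.0
    op = normalized_adjacency(sub_adj)
    h = x[nodes]
    pooled = []
    for _ in range(r + 1):
        pooled.append(h[iu] * h[iv])  # assumed pooling choice
        h = op @ h                    # one diffusion step, no learned weights
    return np.concatenate(pooled)

# Toy usage: a path graph 0-1-2-3-4 with 3-dimensional node features.
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
x = np.arange(15, dtype=float).reshape(5, 3)
features = link_features(adj, x, 1, 2, k=1, r=2)
```

Because no step above involves trainable parameters, such features could be computed once per link before training and then passed to a small classifier. Under these assumptions, that precomputation is what would remove repeated subgraph-level message passing from the training loop.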