Link prediction is central to many real-world applications, but its performance may be hampered when the graph of interest is sparse. To alleviate issues caused by sparsity, we investigate a previously overlooked phenomenon: in many cases, a densely connected, complementary graph can be found for the original graph. The denser graph may share nodes with the original graph, which offers a natural bridge for transferring meaningful knowledge. We identify this setting as Graph Intersection-induced Transfer Learning (GITL), which is motivated by practical applications such as e-commerce and academic co-authorship prediction. We develop a framework to effectively leverage the structural prior in this setting. We first create an intersection subgraph using the shared nodes between the two graphs, then transfer knowledge from the source-enriched intersection subgraph to the full target graph. For the second step, we consider two approaches: a modified label propagation, and a multi-layer perceptron (MLP) model in a teacher-student regime. Experimental results on proprietary e-commerce datasets and open-source citation graphs show that the proposed workflow outperforms existing transfer learning baselines that do not explicitly utilize the intersection structure.
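The first step described in the abstract, building an intersection subgraph from the nodes shared by the two graphs, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `nodes_of` and `intersection_subgraph`, the edge-list representation, and the choice to take the union of source and target edges among shared nodes as the "source-enriched" edge set are all assumptions made for illustration.

```python
from typing import FrozenSet, Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def nodes_of(edges: Iterable[Edge]) -> Set[Hashable]:
    """Collect every node that appears as an endpoint of some edge."""
    return {node for edge in edges for node in edge}


def intersection_subgraph(
    source_edges: Iterable[Edge],
    target_edges: Iterable[Edge],
) -> Tuple[Set[Hashable], Set[FrozenSet[Hashable]]]:
    """Return the shared nodes and the undirected edges among them.

    Assumption (hypothetical reading of "source-enriched"): an edge is kept
    if both endpoints are shared nodes and it appears in either graph.
    """
    source_edges = list(source_edges)
    target_edges = list(target_edges)
    shared = nodes_of(source_edges) & nodes_of(target_edges)

    enriched: Set[FrozenSet[Hashable]] = set()
    for u, v in source_edges + target_edges:
        # Skip self-loops and edges that leave the shared node set.
        if u != v and u in shared and v in shared:
            enriched.add(frozenset((u, v)))
    return shared, enriched


# Toy example: a dense source graph and a sparse target graph.
source = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "x")]
target = [("a", "b"), ("c", "d")]

shared_nodes, enriched_edges = intersection_subgraph(source, target)
print(sorted(shared_nodes))
print(sorted(tuple(sorted(e)) for e in enriched_edges))
```

In this toy example the sparse target graph contributes only one edge among the shared nodes, while the denser source graph supplies two more. That extra structure is what the second step would then transfer to the full target graph.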