With the explosion of graph-structured data, link prediction has become an increasingly important task. Embedding methods for link prediction use neural networks to generate node embeddings, which are then used to predict links between nodes. However, existing embedding methods typically learn node embeddings holistically and ignore the entanglement of latent factors. As a result, the entangled embeddings fail to capture the underlying information effectively and are vulnerable to irrelevant information, which leads to unconvincing and uninterpretable link prediction results. To address these challenges, this paper proposes a novel framework with two variants, the disentangled graph auto-encoder (DGAE) and the variational disentangled graph auto-encoder (VDGAE). Our work is a pioneering effort to apply the disentanglement strategy to link prediction. The proposed framework infers the latent factors that cause edges in the graph and disentangles the representation into multiple channels, each corresponding to a unique latent factor, which improves link prediction performance. To further encourage the embeddings to capture mutually exclusive latent factors, we introduce mutual information regularization to enhance the independence among different channels. Extensive experiments on various real-world benchmarks demonstrate that the proposed methods achieve state-of-the-art results on link prediction tasks compared with a variety of strong baselines. Qualitative analysis on a synthetic dataset also illustrates that the proposed methods can capture distinct latent factors that cause links, providing empirical evidence that our models can explain link prediction results to some extent. All code will be made publicly available upon publication of the paper.
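To make the idea of channel-wise embeddings concrete, the following is a minimal sketch, not the DGAE or VDGAE implementation, which the abstract does not specify. It assumes that a node embedding is a flat vector split evenly into channels, that a link is scored with a plain inner-product decoder, and it uses squared cosine similarity between channels as a simple stand-in for the mutual information regularizer. All function names (`split_channels`, `channel_scores`, `link_probability`, `channel_overlap`) are hypothetical.

```python
import math


def split_channels(z, num_channels):
    """Split a flat embedding into equally sized channel vectors (assumed layout)."""
    if len(z) % num_channels != 0:
        raise ValueError("embedding size must be divisible by num_channels")
    size = len(z) // num_channels
    return [z[k * size:(k + 1) * size] for k in range(num_channels)]


def channel_scores(z_u, z_v, num_channels):
    """Inner product of two nodes within each channel: one score per assumed factor."""
    cu = split_channels(z_u, num_channels)
    cv = split_channels(z_v, num_channels)
    return [sum(a * b for a, b in zip(x, y)) for x, y in zip(cu, cv)]


def link_probability(z_u, z_v, num_channels):
    """Sigmoid of the summed channel scores (a plain inner-product decoder)."""
    total = sum(channel_scores(z_u, z_v, num_channels))
    return 1.0 / (1.0 + math.exp(-total))


def channel_overlap(z, num_channels):
    """Mean squared cosine similarity between distinct channels of one node.

    This is only a proxy for channel dependence, NOT mutual information.
    A value of 0.0 means the channels are mutually orthogonal.
    """
    channels = split_channels(z, num_channels)
    total, pairs = 0.0, 0
    for i in range(num_channels):
        for j in range(i + 1, num_channels):
            dot = sum(a * b for a, b in zip(channels[i], channels[j]))
            norm_i = math.sqrt(sum(a * a for a in channels[i]))
            norm_j = math.sqrt(sum(b * b for b in channels[j]))
            if norm_i > 0.0 and norm_j > 0.0:
                total += (dot / (norm_i * norm_j)) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0


# Two nodes with 4-dimensional embeddings split into 2 channels.
z_u = [1.0, 0.0, 0.0, 2.0]
z_v = [2.0, 0.0, 0.0, 0.0]
scores = channel_scores(z_u, z_v, 2)   # only the first channel contributes
prob = link_probability(z_u, z_v, 2)
```

Inspecting `scores` rather than only `prob` shows which channel drives a predicted link, which is the kind of factor-level reading that a disentangled representation is meant to allow. A penalty such as `channel_overlap` would be added to the reconstruction loss during training; a faithful version would replace it with a mutual information estimate.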