A parameterised model for link prediction using node centrality and similarity measure based on graph embedding

Link prediction is a core task in graph machine learning, with applications as diverse as disease prediction, social network recommendation, and drug discovery. It involves predicting new links that may form between the nodes of a network. Despite its importance, existing models have significant shortcomings. Graph Convolutional Networks (GCNs), for instance, have been shown to be effective for link prediction on a variety of datasets, but they perform poorly on short-path networks and ego networks. This work addresses that gap. We present the Node Centrality and Similarity Based Parameterised Model (NCSM), a novel method for link prediction. NCSM integrates node centrality and similarity measures as edge features in a customised Graph Neural Network (GNN) layer, thereby exploiting the topological information of large networks. This model is the first parameterised GNN-based link prediction model that considers topological information. The proposed model was evaluated on five benchmark graph datasets, each comprising thousands of nodes and edges. In these experiments, NCSM outperforms state-of-the-art models such as Graph Convolutional Networks and the Variational Graph Autoencoder across various metrics and datasets. We attribute this performance to NCSM's integration of node centrality and similarity measures and its efficient use of topological information.
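To make the idea of "centrality and similarity as edge features" concrete, here is a minimal sketch in plain Python. It is not the NCSM implementation. The abstract does not say which centrality or similarity measures are used, so degree centrality and Jaccard neighbourhood similarity stand in as assumptions, and the function names (`build_adjacency`, `edge_features`) are hypothetical. In a full model, vectors like these would be passed to a GNN layer as edge attributes.

```python
def build_adjacency(edges):
    """Return an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Degree divided by (n - 1), the largest degree a node could have."""
    n = len(adj)
    if n <= 1:
        return {node: 0.0 for node in adj}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def jaccard(adj, u, v):
    """Shared neighbours divided by all neighbours of u or v."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def edge_features(adj, pairs):
    """Feature vector per node pair: (centrality_u, centrality_v, similarity)."""
    cent = degree_centrality(adj)
    return {(u, v): (cent[u], cent[v], jaccard(adj, u, v)) for u, v in pairs}


# Toy graph (assumed data): a triangle a-b-c with a pendant node d attached to c.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
adj = build_adjacency(edges)

# Candidate links that are not yet in the graph.
features = edge_features(adj, [("a", "d"), ("b", "d")])
for pair, vec in features.items():
    print(pair, vec)
```

Keeping the features as a plain tuple per pair makes it straightforward to swap in other measures (for example, an embedding-based similarity) without changing the calling code.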



Link prediction


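Link prediction in a network is the task of estimating, from known information such as the network's nodes and structure, the likelihood that a link will exist between two nodes that are not yet connected. It covers both links that exist but have not yet been observed (existing yet unknown links) and links that may form later (future links). Research on this problem is of significant importance and value in both theory and applications.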
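The "existing yet unknown link" setting can be illustrated with a small hold-out experiment: hide one edge, score every unconnected pair, and check where the hidden edge ranks. The sketch below assumes a common-neighbours score as a simple baseline. It is unrelated to NCSM, and the toy graph and the function name `rank_candidates` are hypothetical.

```python
from itertools import combinations


def adjacency(edges):
    """Return an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def rank_candidates(observed_edges):
    """Score each unconnected pair by its number of common neighbours.

    Returns (pair, score) tuples, highest score first; ties are broken
    by the pair itself so the ordering is deterministic.
    """
    adj = adjacency(observed_edges)
    scored = []
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:
            scored.append(((u, v), len(adj[u] & adj[v])))
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


# Toy graph (assumed data). One edge is hidden to play the "unknown link".
full_edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5)]
hidden = (2, 3)
observed = [e for e in full_edges if e != hidden]

ranking = rank_candidates(observed)
for pair, score in ranking:
    print(pair, score)
```

A predictor is doing useful work if hidden edges such as `hidden` tend to appear near the top of `ranking`. The same loop applies to future links by splitting edges by timestamp instead of hiding them at random.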
