This article compares three approaches to link prediction in collaboration networks: an ERGM (Exponential Random Graph Model; Robins et al. 2007), a GCN (Graph Convolutional Network; Kipf and Welling 2017), and a Word2Vec+MLP model (a Word2Vec model combined with a multilayer neural network; Mikolov et al. 2013a and Goodfellow et al. 2016). The ERGM, grounded in statistical methods, is used to capture general structural patterns in the network, while the GCN and Word2Vec+MLP models use deep learning to learn adaptive structural representations of nodes and their relationships. Predictive performance is assessed through extensive simulation exercises using cross-validation, with metrics based on the receiver operating characteristic (ROC) curve. The results show that the machine learning approaches outperform the ERGM in link prediction, particularly in large networks, where traditional models such as the ERGM are limited in scalability and in their ability to capture inherent complexities. These findings highlight the potential benefits of integrating statistical modeling techniques with deep learning methods to analyze complex networks, providing a more robust and effective framework for future research in this field.
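As a rough illustration of the evaluation protocol described in the abstract (cross-validation scored with an ROC-based metric), the following minimal sketch holds out folds of edges, scores candidate pairs, and averages the area under the ROC curve. Everything here is an assumption made for illustration and not the article's method: the common-neighbors scorer stands in for the ERGM, GCN, and Word2Vec+MLP models, and the toy graph, the helper names (`roc_auc`, `common_neighbors`, `cross_validated_auc`), and the negative-sampling scheme are hypothetical.

```python
import itertools
import random


def roc_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5, which matches the rank-based definition of AUC.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def common_neighbors(adj, u, v):
    """Stand-in link score (an assumption): number of shared neighbors."""
    return len(adj[u] & adj[v])


def cross_validated_auc(nodes, edges, k=5, seed=0):
    """Mean AUC over k folds of held-out edges (hypothetical protocol)."""
    rng = random.Random(seed)
    edges = [tuple(sorted(e)) for e in edges]
    edge_set = set(edges)
    # Node pairs that are not linked in the full graph serve as negatives.
    non_edges = [p for p in itertools.combinations(sorted(nodes), 2)
                 if p not in edge_set]
    shuffled = edges[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    aucs = []
    for held_out in folds:
        held = set(held_out)
        # Build the training graph without the held-out edges.
        adj = {n: set() for n in nodes}
        for u, v in edges:
            if (u, v) not in held:
                adj[u].add(v)
                adj[v].add(u)
        pos = [common_neighbors(adj, u, v) for u, v in held_out]
        # Sample as many negatives as positives (one of several possible choices).
        sampled = rng.sample(non_edges, min(len(non_edges), len(held_out)))
        neg = [common_neighbors(adj, u, v) for u, v in sampled]
        aucs.append(roc_auc(pos, neg))
    return sum(aucs) / len(aucs)


# Toy collaboration network (hypothetical): two 5-author cliques and one bridge.
nodes = list(range(10))
edges = (list(itertools.combinations(range(5), 2))
         + list(itertools.combinations(range(5, 10), 2))
         + [(4, 5)])
mean_auc = cross_validated_auc(nodes, edges, k=5, seed=0)
print(f"mean AUC over folds: {mean_auc:.3f}")
```

Replacing `common_neighbors` with the fitted score of any of the three models would leave the evaluation loop unchanged, which is what makes a comparison across models on the same folds possible.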