Latent position models are widely used for the analysis of networks in a variety of research fields, since they possess a number of desirable theoretical properties and are particularly easy to interpret. However, statistical methodologies to fit these models generally incur a computational cost that grows with the square of the number of nodes in the graph, which makes the analysis of large social networks impractical. In this paper, we propose a new method with much lower computational complexity, which can be used to fit latent position models on networks of several tens of thousands of nodes. Our approach relies on an approximation of the likelihood function, where the amount of noise introduced by the approximation can be made arbitrarily small at the expense of computational efficiency. We establish several theoretical results that show how the likelihood error propagates to the invariant distribution of the Markov chain Monte Carlo sampler. In particular, we demonstrate that one can achieve a substantial reduction in computing time and still obtain a good estimate of the latent structure. Finally, we present applications of our method to simulated networks and to a large coauthorship network, highlighting the usefulness of our approach.
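To make the quadratic cost mentioned in the abstract concrete, the following is a minimal sketch of an exact log-likelihood evaluation for a latent position model. The abstract does not specify the model, so the form used here is an assumption: an undirected binary network in which the log-odds of an edge between nodes i and j equal an intercept `alpha` minus the Euclidean distance between their latent positions. The names `lpm_log_likelihood`, `softplus`, `adjacency`, `positions`, and `alpha` are hypothetical and chosen only for illustration. This is the unapproximated likelihood, not the approximation proposed in the paper.

```python
import math
from itertools import combinations


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def lpm_log_likelihood(adjacency, positions, alpha):
    """Exact log-likelihood of an undirected, binary latent position model.

    Assumption (not taken from the abstract): the log-odds of an edge
    between nodes i and j are alpha - ||z_i - z_j||.
    """
    n = len(positions)
    total = 0.0
    # Every unordered pair of nodes is visited once: n * (n - 1) / 2 terms,
    # which is the source of the quadratic cost.
    for i, j in combinations(range(n), 2):
        eta = alpha - math.dist(positions[i], positions[j])
        # Bernoulli log-probability with logit eta: y * eta - log(1 + exp(eta)).
        total += adjacency[i][j] * eta - softplus(eta)
    return total


# Toy network with three nodes in a two-dimensional latent space.
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(lpm_log_likelihood(adjacency, positions, alpha=1.0))
```

Under this assumed form, a network of 50,000 nodes requires roughly 1.25 billion pairwise terms per likelihood evaluation, and a Markov chain Monte Carlo sampler repeats such evaluations at every iteration.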