A Zero-Inflated Poisson Latent Position Cluster Model

The latent position network model (LPM) is a popular approach for the statistical analysis of network data. It assigns nodes to random positions in a latent space, such that the probability of an interaction between each pair of nodes is determined by their distance in this space. A key feature of the model is that the latent space representation allows nuanced structures to be visualized. The LPM can be extended to the Latent Position Cluster Model (LPCM), which accommodates clustering of nodes by assuming that the latent positions follow a finite mixture distribution. In this paper, we extend the LPCM to accommodate missing network data and apply it to social networks with non-negative discrete edge weights. By treating missing data as "unusual" zero interactions, we combine the LPCM with the zero-inflated Poisson distribution. Statistical inference is based on a novel partially collapsed Markov chain Monte Carlo algorithm, in which a Mixture-of-Finite-Mixtures (MFM) model is adopted to automatically determine the number of clusters and the optimal group partitioning. Our algorithm features a truncated absorb-eject move, a novel adaptation, within the context of MFMs, of an idea commonly used in collapsed samplers. We also illustrate our results on 3-dimensional latent spaces, which maintain clear visualizations while offering more flexibility than 2-dimensional models. The performance of this approach is illustrated via three carefully designed simulation studies and four publicly available real networks, where some interesting new perspectives are uncovered.
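To make the zero-inflated Poisson component concrete, the following is a minimal sketch rather than the paper's exact specification. It assumes that the Poisson rate for a pair of nodes is exp(beta - distance), where beta is a hypothetical intercept, and that a single probability p governs the "unusual" zeros; the abstract does not state either parameterization, and the names zip_pmf and edge_rate are illustrative only.

```python
import math

def zip_pmf(y, lam, p):
    """P(Y = y) under a zero-inflated Poisson: with probability p the count is
    an "unusual" (structural) zero, otherwise it is drawn from Poisson(lam)."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return p * (y == 0) + (1.0 - p) * poisson

def edge_rate(u_i, u_j, beta):
    """Assumed link (not taken from the paper): the log-rate is an intercept
    beta minus the Euclidean distance between the two latent positions."""
    return math.exp(beta - math.dist(u_i, u_j))

# Hypothetical 3-dimensional latent positions for two nodes.
u_a = (0.0, 0.0, 0.0)
u_b = (1.0, 2.0, 2.0)

lam = edge_rate(u_a, u_b, beta=3.0)   # Poisson rate for this pair
prob_zero = zip_pmf(0, lam, p=0.2)    # probability of observing no interaction
print(lam, prob_zero)
```

Under these assumptions, nodes that are closer in the latent space have a larger expected interaction count, while p inflates the probability of a zero beyond what the Poisson part alone would give.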
