Weighted networks encode not only the presence of interactions but also their strength. Existing methods for community detection in weighted networks often rely on Poisson models, which can be restrictive for overdispersed data and make efficient posterior computation difficult when covariates are incorporated. We propose Bayesian stochastic block models based on the zero-inflated negative binomial distribution: ZINB-SBM without covariates and CZINB-SBM with pairwise covariates. The proposed models accommodate overdispersion, naturally account for missing interactions through zero inflation, and admit efficient Gibbs sampling. In CZINB-SBM, Pólya-Gamma data augmentation enables posterior inference for regression coefficients with uncertainty quantification. We further employ a dynamic mixture of finite mixtures, which allows the number of communities to be inferred from the data and can lead to more accurate clustering. Simulation studies show that ZINB-SBM is more robust than a zero-inflated Poisson SBM for highly overdispersed networks. A real-data analysis demonstrates interpretable block-specific covariate effects and substantially improved missing-link prediction compared with a Poisson regression-based Bayesian SBM.
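To make the generative idea concrete, the following is a minimal sketch of how edge weights could be simulated from a zero-inflated negative binomial block model without covariates. It is an illustration under stated assumptions, not the paper's implementation: the function name `sample_zinb_sbm`, the block-level parameters `pi`, `r`, and `p`, the undirected network without self-loops, and the use of NumPy's negative binomial parameterization (number of failures before `r` successes, each with success probability `p`) are all hypothetical choices made here.

```python
import numpy as np

def sample_zinb_sbm(z, pi, r, p, rng):
    """Draw a symmetric weighted adjacency matrix from a ZINB block model.

    All names and the parameterization are illustrative assumptions.
    z  : (n,) integer community labels in {0, ..., K-1}
    pi : (K, K) probability of a structural zero for each block pair
    r  : (K, K) negative binomial size (dispersion) for each block pair
    p  : (K, K) negative binomial success probability for each block pair
    """
    n = len(z)
    A = np.zeros((n, n), dtype=int)
    iu, ju = np.triu_indices(n, k=1)      # node pairs with i < j
    a, b = z[iu], z[ju]                   # block labels of each pair
    # Count component: overdispersed weights from a negative binomial.
    counts = rng.negative_binomial(r[a, b], p[a, b])
    # Zero-inflation component: with probability pi the edge is a forced zero.
    structural_zero = rng.random(iu.size) < pi[a, b]
    w = np.where(structural_zero, 0, counts)
    A[iu, ju] = w
    A[ju, iu] = w                         # undirected network, no self-loops
    return A

# Example: five nodes in two communities with shared block parameters.
rng = np.random.default_rng(0)
z = np.array([0, 0, 1, 1, 1])
K = 2
A = sample_zinb_sbm(z, np.full((K, K), 0.3), np.full((K, K), 2.0),
                    np.full((K, K), 0.4), rng)
```

Under this parameterization, the count component has mean r(1-p)/p and variance r(1-p)/p², so the variance always exceeds the mean, which is the overdispersion a Poisson model cannot represent.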