Weighted networks encode not only the presence of interactions but also their strength. Existing methods for community detection in weighted networks often rely on Poisson models, which can be restrictive for overdispersed data and make efficient posterior computation difficult when covariates are incorporated. We propose Bayesian stochastic block models based on the zero-inflated negative binomial distribution: ZINB-SBM without covariates and CZINB-SBM with pairwise covariates. The proposed models accommodate overdispersion, naturally account for missing interactions through zero inflation, and admit efficient Gibbs sampling. In CZINB-SBM, Pólya-Gamma data augmentation enables posterior inference for the regression coefficients with uncertainty quantification. We further employ a dynamic mixture of finite mixtures, which allows the number of communities to be inferred from the data and can lead to more accurate clustering. Simulation studies show that ZINB-SBM is more robust than a zero-inflated Poisson SBM for highly overdispersed networks. A real-data analysis yields interpretable block-specific covariate effects and substantially better missing-link prediction than a Poisson regression-based Bayesian SBM.
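To make the two modelling ingredients named above concrete (zero inflation and negative binomial overdispersion), the following is a minimal sketch of a zero-inflated negative binomial edge-weight distribution. The parameterization is an assumption, not taken from the paper: `pi` is the probability of a structural zero, `r` is the dispersion, and `p` is the success probability, so the negative binomial part has mean `r * (1 - p) / p`. The function names `zinb_pmf` and `zinb_moments` are hypothetical, and in a block model these parameters would presumably vary by community pair.

```python
import math


def zinb_pmf(y, pi, r, p):
    """P(Y = y) for a zero-inflated negative binomial (assumed parameterization).

    pi: probability of a structural zero, 0 <= pi < 1
    r:  dispersion parameter, r > 0 (need not be an integer)
    p:  success probability, 0 < p < 1
    """
    # Negative binomial log-pmf, computed with lgamma for numerical stability.
    log_nb = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(p) + y * math.log1p(-p)
    )
    nb = math.exp(log_nb)
    # Zeros arise either structurally (prob. pi) or from the count component.
    return pi * (y == 0) + (1.0 - pi) * nb


def zinb_moments(pi, r, p):
    """Mean and variance of the zero-inflated negative binomial."""
    nb_mean = r * (1.0 - p) / p
    nb_var = r * (1.0 - p) / p**2
    mean = (1.0 - pi) * nb_mean
    var = (1.0 - pi) * (nb_var + pi * nb_mean**2)
    return mean, var


if __name__ == "__main__":
    mean, var = zinb_moments(pi=0.3, r=2.0, p=0.4)
    # A Poisson model forces var == mean; here the variance exceeds the mean.
    print(f"mean={mean:.3f} var={var:.3f}")
    print(f"P(Y=0)={zinb_pmf(0, 0.3, 2.0, 0.4):.3f}")
```

Under this sketch the variance always exceeds the mean, which is the overdispersion a Poisson edge model cannot represent, and the zero probability is raised above that of the plain negative binomial by the mixing weight `pi`.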