Detecting communities in networks and graphs is an important task across many disciplines, including statistics, social science, and engineering. For the case of two communities, there are generally three kinds of mixing patterns: assortative mixing, disassortative mixing, and core-periphery structure. Modularity optimization is a classical way of fitting network models with communities. However, it can only handle assortative and disassortative mixing, and only when the mixing pattern is known; it fails to discover core-periphery structure. In this paper, we extend modularity and propose a new framework based on Unified Bigroups Standardized Edge-count Analysis (UBSea). It can address all of the aforementioned community mixing structures. In addition, the new framework automatically chooses the mixing type that fits the network. Simulation studies show that the new framework performs strongly across a wide range of settings under the stochastic block model and the degree-corrected stochastic block model. We show that the new approach produces a consistent estimate of the communities under a suitable signal-to-noise ratio condition, for a block model with two communities, for both undirected and directed networks. The new method is illustrated through applications to several real-world datasets.
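To make the three mixing patterns concrete, the following is a minimal sketch that samples two-community stochastic block models and evaluates the classical Newman-Girvan modularity at the true labels. It is not the UBSea procedure; the helper names (`sample_sbm`, `modularity`), the block sizes, and the edge probabilities are all hypothetical choices made for illustration only.

```python
import numpy as np

def sample_sbm(sizes, P, rng):
    """Sample an undirected, loop-free adjacency matrix from a block model."""
    labels = np.repeat(np.arange(len(sizes)), sizes)
    probs = P[labels][:, labels]                       # edge probability per node pair
    n = labels.size
    upper = np.triu(rng.random((n, n)) < probs, k=1)   # sample the upper triangle only
    A = (upper | upper.T).astype(int)                  # symmetrize, zero diagonal
    return A, labels

def modularity(A, labels):
    """Newman-Girvan Q = (1/2m) * sum_ij (A_ij - k_i k_j / 2m) * [g_i == g_j]."""
    k = A.sum(axis=1)                                  # node degrees
    two_m = k.sum()                                    # twice the number of edges
    same = labels[:, None] == labels[None, :]          # same-community indicator
    return ((A - np.outer(k, k) / two_m) * same).sum() / two_m

rng = np.random.default_rng(0)
sizes = [100, 100]

# Hypothetical 2x2 block probability matrices, one per mixing pattern.
patterns = {
    "assortative":    np.array([[0.30, 0.05], [0.05, 0.30]]),
    "disassortative": np.array([[0.05, 0.30], [0.30, 0.05]]),
    "core-periphery": np.array([[0.30, 0.15], [0.15, 0.05]]),
}

results = {}
for name, P in patterns.items():
    A, labels = sample_sbm(sizes, P, rng)
    results[name] = modularity(A, labels)
    print(f"{name:>15}: Q = {results[name]:+.3f}")
```

Under these assumed probabilities, the modularity of the true partition is clearly positive for assortative mixing, clearly negative for disassortative mixing, and close to zero for the core-periphery case. This is one way to see why neither maximizing nor minimizing modularity is well suited to the third pattern.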