Community detection is a classic problem in network science with extensive applications in various fields. Among numerous approaches, the most common method is modularity maximization. Despite their design philosophy and wide adoption, heuristic modularity maximization algorithms rarely return an optimal partition, or even one close to optimal. We propose a specialized algorithm, Bayan, which returns partitions with a guarantee of either optimality or proximity to an optimal partition. At the core of the Bayan algorithm is a branch-and-cut scheme that solves an integer programming formulation of the problem to optimality or approximates it within a factor. We demonstrate Bayan's distinctive accuracy and stability over 21 other algorithms in retrieving ground-truth communities in synthetic benchmarks and node labels in real networks. Bayan is several times faster than open-source and commercial solvers for modularity maximization, which makes it capable of finding optimal partitions for instances that cannot be optimized by any other existing method. Overall, our assessments point to Bayan as a suitable choice, on ordinary computers, for exact maximization of modularity in networks with up to 3000 edges (in their largest connected component) and for approximating maximum modularity in larger networks.
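To make the objective concrete, the following minimal sketch computes the standard (Newman) modularity of a partition and finds a maximum-modularity partition of a tiny graph by exhaustive enumeration. It is an illustration only and is not Bayan's branch-and-cut scheme: the function names `modularity` and `all_partitions` and the six-node example graph are assumptions introduced here, and the enumeration is feasible only for a handful of nodes because the number of partitions grows as the Bell numbers.

```python
def modularity(edges, community):
    """Modularity Q of a partition of an undirected, unweighted simple graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to a community label.
    Uses Q = sum over communities c of [ L_c / m - (d_c / (2m))^2 ],
    where L_c is the number of edges inside c, d_c is the sum of the
    degrees of the nodes in c, and m is the total number of edges.
    """
    m = len(edges)
    internal = {}    # L_c: edges with both endpoints in community c
    degree_sum = {}  # d_c: total degree of the nodes in community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


def all_partitions(nodes):
    """Yield every partition of `nodes` as a dict node -> label.

    Labels follow a restricted growth string: each node may join an
    existing community or open exactly one new one, so every partition
    is generated once.
    """
    nodes = list(nodes)

    def extend(i, labels, n_used):
        if i == len(nodes):
            yield dict(zip(nodes, labels))
            return
        for lab in range(n_used + 1):
            yield from extend(i + 1, labels + [lab], max(n_used, lab + 1))

    yield from extend(0, [], 0)


# Hypothetical example graph: two triangles joined by a single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
two_triangles = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

q_two_triangles = modularity(edges, two_triangles)

# Exhaustive search: exact, but only practical for very small graphs.
best_partition = max(all_partitions(range(6)),
                     key=lambda p: modularity(edges, p))
best_q = modularity(edges, best_partition)
```

For this graph, the two-triangle partition has modularity 5/14, while placing all nodes in a single community gives 0. Comparing `best_q` with the score of a heuristic's output is the kind of optimality gap that an exact method is designed to close, although on realistic instances that requires an integer programming approach rather than enumeration.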