The overlap gap property (OGP) is a statement about the geometry of near-optimal solutions. The presence of OGP implies the failure of a class of local algorithms, and it has been observed to coincide with conjectured algorithmic limits in problems with a statistical-computational gap. We consider the Stochastic Block Model (SBM), in which the graph has a planted partition into $k$ equal-size blocks, the `communities'. For parameters $p>q$, vertices in the same community connect with probability $p$ and vertices in different communities connect with probability $q$, independently across pairs of vertices. Modularity-based clustering algorithms have become ubiquitous in applications. This article studies the theoretical limits of local algorithms based on the modularity score on the SBM. We establish that modularity exhibits OGP on the SBM. This rules out a class of local algorithms based on modularity for recovery in the SBM, and it implies a slow mixing time for a related Markov chain. Theoretically, this is one of the few instances where OGP has been established for a `planted' model, as most such analyses to date consider the `null' model. As part of our analysis, we extend a result of Bickel and Chen (2009), who established that, with high probability, the modularity-optimal partition of the SBM is $o(n)$ local moves away from the planted partition, where $n$ is the graph size. We show that, with high probability, any partition with modularity score sufficiently near the optimal value is close to the planted partition.
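To make the setting concrete, the following is a minimal sketch of the SBM described above, together with a modularity score. It assumes the standard Newman-Girvan form of modularity, since the abstract does not state the exact normalization used in the article. The function names `sample_sbm` and `modularity`, and the parameter values in the demo, are illustrative choices and are not taken from the article.

```python
import random
from itertools import combinations


def sample_sbm(n, k, p, q, seed=0):
    """Sample an SBM graph with k equal-size blocks.

    Returns (edges, labels), where labels[v] is the planted community of v.
    """
    if n % k != 0:
        raise ValueError("n must be divisible by k for equal-size blocks")
    rng = random.Random(seed)
    # Planted partition: vertices 0..n/k-1 in block 0, the next n/k in block 1, ...
    labels = [v * k // n for v in range(n)]
    edges = []
    for u, v in combinations(range(n), 2):
        # Same community: connect with probability p; otherwise with probability q.
        prob = p if labels[u] == labels[v] else q
        if rng.random() < prob:
            edges.append((u, v))
    return edges, labels


def modularity(edges, labels):
    """Newman-Girvan modularity (assumed form) of the partition given by labels."""
    m = len(edges)
    if m == 0:
        return 0.0
    num_blocks = max(labels) + 1
    internal = [0] * num_blocks    # edges with both endpoints in the block
    degree_sum = [0] * num_blocks  # total degree of the block's vertices
    for u, v in edges:
        degree_sum[labels[u]] += 1
        degree_sum[labels[v]] += 1
        if labels[u] == labels[v]:
            internal[labels[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in range(num_blocks)
    )


# Illustrative parameters (assumptions, not values from the article).
edges, planted = sample_sbm(n=120, k=3, p=0.5, q=0.05, seed=1)
shuffled = planted[:]
random.Random(2).shuffle(shuffled)  # random partition with the same block sizes

score_planted = modularity(edges, planted)
score_shuffled = modularity(edges, shuffled)
print(score_planted, score_shuffled)
```

With $p$ well above $q$, the planted partition should score far higher than a random partition with the same block sizes. The sketch only illustrates the model and the score; it does not demonstrate the overlap gap itself.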