We study the behavior of a label propagation algorithm (LPA) on the Erd\H{o}s-R\'enyi random graph $\mathcal{G}(n,p)$. Initially, given a network, each vertex starts with a random label in the interval $[0,1]$. Then, in each round of LPA, every vertex switches its label to the majority label in its neighborhood (including its own label). In the first round, ties are broken towards smaller labels, while in each subsequent round, ties are broken uniformly at random. The algorithm terminates once all labels stay the same in two consecutive iterations. LPA is successfully used in practice for detecting communities in networks (corresponding to vertex sets with the same label after termination of the algorithm). Perhaps surprisingly, LPA's performance on dense random graphs is hard to analyze, and so far convergence to consensus was known only when $np\ge n^{3/4+\varepsilon}$. By a very careful multi-stage exposure of the edges, we break this barrier and show that, when $np \ge n^{5/8+\varepsilon}$, asymptotically almost surely (a.a.s.) the algorithm terminates with a single label. Moreover, we show that, if $np\gg n^{2/3}$, a.a.s. this label is the smallest one, whereas if $n^{5/8+\varepsilon}\le np\ll n^{2/3}$, the surviving label is a.a.s. not the smallest one.
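The update rule described above can be simulated directly. The following is a minimal sketch, not the paper's analysis: the function names `gnp`, `lpa_round`, and `run_lpa` are hypothetical, all vertices are assumed to update synchronously, and the `max_rounds` cap is an added safeguard in case the labels never stabilize on a small instance.

```python
import random
from collections import Counter


def gnp(n, p, rng):
    """Sample G(n, p) as adjacency lists (hypothetical helper)."""
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def lpa_round(adj, labels, first_round, rng):
    """One synchronous round: each vertex adopts the most frequent label
    in its closed neighborhood (its neighbors plus itself)."""
    new_labels = []
    for v, nbrs in enumerate(adj):
        counts = Counter(labels[u] for u in nbrs)
        counts[labels[v]] += 1  # the vertex's own label also counts
        top = max(counts.values())
        tied = [lab for lab, c in counts.items() if c == top]
        # First round: ties go to the smallest label.
        # Later rounds: ties are broken uniformly at random.
        new_labels.append(min(tied) if first_round else rng.choice(tied))
    return new_labels


def run_lpa(n, p, seed=0, max_rounds=1000):
    """Run LPA until no label changes in a round, or until max_rounds
    (an assumed safety cap, not part of the algorithm as stated)."""
    rng = random.Random(seed)
    adj = gnp(n, p, rng)
    labels = [rng.random() for _ in range(n)]  # random labels in [0, 1)
    for t in range(max_rounds):
        new_labels = lpa_round(adj, labels, t == 0, rng)
        if new_labels == labels:
            return labels, t
        labels = new_labels
    return labels, max_rounds


if __name__ == "__main__":
    final, rounds = run_lpa(n=200, p=0.5, seed=1)
    print(len(set(final)), "distinct label(s) after", rounds, "round(s)")
```

The thresholds $n^{5/8+\varepsilon}$ and $n^{2/3}$ are asymptotic statements, so a simulation at small $n$ only illustrates the dynamics and should not be expected to reproduce them.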