Community detection in the hypergraph stochastic block model and reconstruction on hypertrees

We study the weak recovery problem for the $r$-uniform hypergraph stochastic block model ($r$-HSBM) with two balanced communities. In this model, $n$ vertices are randomly divided into two communities, and size-$r$ hyperedges are added randomly, with a probability that depends on whether all vertices in the hyperedge are in the same community. The goal of weak recovery is to recover a non-trivial fraction of the communities given the hypergraph. Pal and Zhu (2021) and Stephan and Zhu (2022) established that weak recovery is always possible above a natural threshold called the Kesten-Stigum (KS) threshold. For assortative models (i.e., monochromatic hyperedges are preferred), Gu and Polyanskiy (2023) proved that the KS threshold is tight if $r\le 4$ or the expected degree $d$ is small. For other cases, the tightness of the KS threshold remained open. In this paper we determine the tightness of the KS threshold for a wide range of parameters. We prove that for $r\le 6$ and $d$ large enough, the KS threshold is tight. This shows that there is no information-computation gap in this regime and partially confirms a conjecture of Angelini et al. (2015). On the other hand, we show that for $r\ge 5$, there exist parameters for which the KS threshold is not tight. In particular, for $r\ge 7$, the KS threshold is not tight if the model is disassortative (i.e., polychromatic hyperedges are preferred) or $d$ is large enough. This provides more evidence supporting the existence of an information-computation gap in these cases. Furthermore, we establish asymptotic bounds on the weak recovery threshold for fixed $r$ and large $d$. We also obtain a number of results regarding the broadcasting on hypertrees (BOHT) model, including the asymptotics of the reconstruction threshold for $r\ge 7$ and the impossibility of robust reconstruction at criticality.
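
As an illustration of the model described above, the following is a minimal sketch of sampling a small $r$-HSBM instance by brute force. The function name `sample_hsbm` and the parameters `p_in` and `p_out` (the inclusion probabilities for monochromatic and non-monochromatic hyperedges) are assumptions introduced here for illustration; they are not the paper's notation. Labels are drawn independently and uniformly, which is one way to model two balanced communities.

```python
import itertools
import random


def sample_hsbm(n, r, p_in, p_out, seed=None):
    """Sample a small r-uniform HSBM with two communities (illustrative sketch).

    p_in  : probability of including a hyperedge whose r vertices share a label
    p_out : probability of including any other hyperedge
    Both parameter names are hypothetical and chosen for this example only.
    """
    rng = random.Random(seed)
    # Assign each vertex to community +1 or -1 uniformly at random.
    labels = [rng.choice((+1, -1)) for _ in range(n)]
    edges = []
    # Enumerate every size-r vertex subset; only feasible for small n.
    for e in itertools.combinations(range(n), r):
        monochromatic = len({labels[v] for v in e}) == 1
        p = p_in if monochromatic else p_out
        if rng.random() < p:
            edges.append(e)
    return labels, edges


if __name__ == "__main__":
    # Assortative example: monochromatic hyperedges are preferred.
    labels, edges = sample_hsbm(n=12, r=3, p_in=0.3, p_out=0.05, seed=0)
    print(labels)
    print(len(edges), "hyperedges")
```

Choosing `p_in > p_out` gives an assortative instance and `p_in < p_out` a disassortative one. In the sparse setting of the abstract, these probabilities would be scaled with $n$ so that the expected degree stays at a fixed $d$; the enumeration above is only meant for small $n$.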
