We consider the community detection problem in random hypergraphs under the non-uniform hypergraph stochastic block model (HSBM), where each hyperedge appears independently with a given probability that depends only on the labels of its vertices. We establish, for the first time in the literature, a sharp threshold for exact recovery in this non-uniform case, subject to minor constraints; in particular, we consider the model with multiple communities ($K \geq 2$). A crucial point is that by aggregating information from all the uniform layers, exact recovery may be achievable even when it appears impossible for each layer considered alone. We provide two efficient algorithms that achieve exact recovery above the threshold. The theoretical analysis of our algorithms relies on the concentration and regularization of the adjacency matrix for non-uniform random hypergraphs, which could be of independent interest. We also address some open problems regarding parameter knowledge and estimation.
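To make the model concrete, the following is a minimal sketch of sampling a non-uniform HSBM and aggregating its uniform layers into a single adjacency matrix. It rests on assumptions that are not taken from the abstract: each layer of hyperedge size $m$ is parametrized by only two probabilities (`p_in` when all $m$ vertices share a label, `p_out` otherwise), and the adjacency entry for a pair of vertices counts the hyperedges, across all layers, that contain both. The names `sample_hsbm` and `adjacency` are hypothetical, and the general model allows the probability to depend on the labels in richer ways.

```python
import itertools
import random


def sample_hsbm(labels, layer_probs, seed=0):
    """Sample hyperedges from a simplified non-uniform HSBM.

    labels: list of community labels, one per vertex.
    layer_probs: dict mapping hyperedge size m -> (p_in, p_out).
        ASSUMPTION (illustrative only): a size-m hyperedge appears with
        probability p_in if all its vertices share a label, else p_out.
    Returns a list of hyperedges, each a tuple of vertex indices.
    """
    rng = random.Random(seed)
    n = len(labels)
    edges = []
    for m, (p_in, p_out) in sorted(layer_probs.items()):
        # Each candidate hyperedge of size m is included independently.
        for e in itertools.combinations(range(n), m):
            same_community = len({labels[v] for v in e}) == 1
            p = p_in if same_community else p_out
            if rng.random() < p:
                edges.append(e)
    return edges


def adjacency(n, edges):
    """Aggregate all layers: A[i][j] counts hyperedges containing i and j.

    ASSUMPTION: this counting definition is one common choice of
    adjacency matrix for hypergraphs, used here for illustration.
    """
    A = [[0] * n for _ in range(n)]
    for e in edges:
        for i, j in itertools.combinations(e, 2):
            A[i][j] += 1
            A[j][i] += 1
    return A


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1]
    # Two uniform layers: ordinary edges (m=2) and 3-uniform hyperedges.
    edges = sample_hsbm(labels, {2: (0.8, 0.1), 3: (0.5, 0.05)}, seed=1)
    for row in adjacency(len(labels), edges):
        print(row)
```

Because the matrix sums contributions from every layer, a pair of vertices in the same community can accumulate evidence from hyperedges of several sizes, which is the aggregation effect described above.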