We consider the community detection problem in random hypergraphs under the non-uniform hypergraph stochastic block model (HSBM), where each hyperedge appears independently with a given probability that depends only on the labels of its vertices. We establish, for the first time in the literature, a sharp threshold for exact recovery in this non-uniform case, subject to minor constraints; in particular, we consider the model with $K$ classes as well as the symmetric binary model ($K=2$). A crucial point is that, by aggregating information from all the uniform layers, exact recovery can be obtained even in cases where it may appear impossible when each layer is considered alone. We provide two efficient algorithms that achieve exact recovery above the threshold. The theoretical analysis of these algorithms relies on concentration and regularization of the adjacency matrix for non-uniform random hypergraphs, which could be of independent interest. We also address some open problems regarding parameter knowledge and estimation.
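To make the model concrete, the following is a minimal sketch of how a non-uniform HSBM could be sampled and how the layers could be aggregated into a single adjacency matrix. It rests on assumptions not stated in the abstract: each layer of hyperedge size $m$ uses only two probabilities (one for hyperedges whose vertices share a label, one for all others), and the adjacency entry for a pair of vertices counts the hyperedges, across all layers, that contain both. The names `sample_nonuniform_hsbm`, `adjacency_matrix`, `labels`, and `probs` are hypothetical and chosen for illustration only.

```python
import itertools
import random


def sample_nonuniform_hsbm(labels, probs, seed=0):
    """Sample hyperedges layer by layer (illustrative sketch).

    labels: list of community labels, one per vertex.
    probs:  dict mapping hyperedge size m -> (p_in, p_out). Assumption:
            p_in applies when all m vertices share a label, p_out otherwise.
    """
    rng = random.Random(seed)
    n = len(labels)
    hyperedges = []
    for m, (p_in, p_out) in sorted(probs.items()):
        # Every m-subset of vertices is an independent candidate hyperedge.
        for subset in itertools.combinations(range(n), m):
            same_label = len({labels[v] for v in subset}) == 1
            if rng.random() < (p_in if same_label else p_out):
                hyperedges.append(subset)
    return hyperedges


def adjacency_matrix(n, hyperedges):
    """Aggregate all layers: A[i][j] counts hyperedges containing i and j."""
    A = [[0] * n for _ in range(n)]
    for edge in hyperedges:
        for i, j in itertools.combinations(edge, 2):
            A[i][j] += 1
            A[j][i] += 1
    return A


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1]
    # Two layers: ordinary edges (m = 2) and 3-uniform hyperedges (m = 3).
    probs = {2: (0.6, 0.1), 3: (0.5, 0.05)}
    edges = sample_nonuniform_hsbm(labels, probs, seed=1)
    for row in adjacency_matrix(len(labels), edges):
        print(row)
```

Enumerating all $m$-subsets is only practical for small instances; the sketch is meant to show how the layers combine into one matrix, not to reproduce the algorithms analyzed in the paper.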