Hypergraphs are widely used to study systems with higher-order interactions. Despite recent advances in methods for community detection in these systems, a theoretical analysis of their detectability limits is still lacking. Here, we derive closed-form bounds for community detection in hypergraphs. Using a message-passing formulation, we demonstrate that detectability depends on the structural properties of a hypergraph, such as the distribution of hyperedge sizes or their assortativity. Our formulation enables a characterization of the entropy of a hypergraph in relation to that of its clique expansion, showing that community detection is enhanced when hyperedges overlap strongly on pairs of nodes. We develop an efficient message-passing algorithm to learn communities and model parameters on large systems. Additionally, we devise an exact sampling routine to generate synthetic data from our probabilistic model. With these methods, we numerically investigate the boundaries of community detection in synthetic datasets and extract communities from real systems. Our results extend the understanding of the limits of community detection in hypergraphs and introduce flexible mathematical tools to study systems with higher-order interactions.
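As a toy illustration of the clique expansion referred to in the abstract, the following Python sketch replaces each hyperedge with all pairs of its nodes and counts how many hyperedges contain each pair. It is not the authors' algorithm: the function name `clique_expansion`, the example hyperedges, and the convention of weighting a pair by the number of hyperedges that contain it are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def clique_expansion(hyperedges):
    """Map each node pair to the number of hyperedges containing it.

    Hypothetical helper: the pair-count weighting is one common
    convention, assumed here purely for illustration.
    """
    weights = Counter()
    for edge in hyperedges:
        # Each hyperedge contributes one unit to every pair of its nodes.
        for u, v in combinations(sorted(set(edge)), 2):
            weights[(u, v)] += 1
    return weights


# Made-up example: three hyperedges over four nodes.
hyperedges = [(0, 1, 2), (1, 2, 3), (2, 3)]
weights = clique_expansion(hyperedges)

# Pairs covered by more than one hyperedge, i.e. where hyperedges overlap.
overlapping_pairs = {pair: w for pair, w in weights.items() if w > 1}
```

In this example the pairs `(1, 2)` and `(2, 3)` each appear in two hyperedges, which is the kind of pairwise overlap the abstract associates with enhanced community detection.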