This paper considers the problem of community detection on multiple potentially correlated graphs from an information-theoretic perspective. We first put forth a random graph model, called the multi-view stochastic block model (MVSBM), designed to generate correlated graphs on the same set of $n$ nodes. The $n$ nodes are partitioned into two disjoint communities of equal size. For each pair of nodes, the presence or absence of an edge in each graph depends on whether the two nodes belong to the same community. The learner's objective is to recover the hidden communities from the observed graphs. Our technical contributions are two-fold: (i) We establish an information-theoretic upper bound (Theorem~1) showing that exact recovery of the communities is achievable when the model parameters of the MVSBM exceed a certain threshold. (ii) Conversely, we derive an information-theoretic lower bound (Theorem~2) showing that when the model parameters of the MVSBM fall below this threshold, the expected number of misclassified nodes is greater than one for any estimator. Our results for the MVSBM recover, as special cases, several prior results for community detection in the standard SBM and in multiple independent SBMs.
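As a rough illustration of the setting, the following Python sketch samples several graphs on a shared node set with two equal-size hidden communities. It covers only the special case of multiple independent SBMs mentioned above, because the abstract does not specify how the MVSBM correlates the views. The function name `sample_independent_views`, the number of views `m`, and the within-community and cross-community edge probabilities `p` and `q` are assumptions introduced here for illustration, not notation from the paper.

```python
import numpy as np


def sample_independent_views(n, m, p, q, seed=0):
    """Sample m independent two-community SBM graphs on the same n nodes.

    Assumption: views are independent given the labels. The MVSBM's
    correlation structure is not reproduced here.
    """
    if n % 2 != 0:
        raise ValueError("n must be even for two equal-size communities")
    rng = np.random.default_rng(seed)

    # Hidden labels: exactly n/2 nodes are +1 and n/2 are -1, in random order.
    labels = rng.permutation(np.repeat([1, -1], n // 2))

    # Edge probability is p for same-community pairs and q otherwise.
    same = np.equal.outer(labels, labels)
    probs = np.where(same, p, q)

    # Draw each unordered pair once (strict upper triangle), then symmetrize.
    upper = np.triu(np.ones((n, n), dtype=bool), k=1)
    views = np.zeros((m, n, n), dtype=np.int8)
    for v in range(m):
        edges = (rng.random((n, n)) < probs) & upper
        views[v] = edges | edges.T
    return labels, views


if __name__ == "__main__":
    labels, views = sample_independent_views(n=10, m=3, p=0.8, q=0.1)
    print(labels)
    print(views[0])
```

Each view is a symmetric 0/1 adjacency matrix with an empty diagonal, and all views share the same hidden labels, which is what a recovery procedure would try to estimate from the observed graphs.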