In this work, we focus on the Bipartite Stochastic Block Model (BiSBM), a popular model for bipartite graphs with a community structure. We consider the high-dimensional setting where the number $n_1$ of type I nodes is far smaller than the number $n_2$ of type II nodes. The recent work of Braun and Tyagi (2022) established a necessary and sufficient condition on the sparsity level $p_{max}$ of the bipartite graph for recovering the latent partition of type I nodes. To achieve this goal, they proposed an iterative method that extends the one proposed by Ndaoud et al. (2022). Their method requires a good enough initialization, usually obtained by a spectral method, but empirical results showed that the refinement algorithm does not improve much on the performance of the spectral method. This suggests that the spectral method achieves exact recovery in the same regime as the refinement method. We show that this is indeed the case by providing new entrywise bounds on the eigenvectors of the similarity matrix used by the spectral method. Our analysis extends the framework of Lei (2019), which applies only to symmetric matrices with limited dependencies. As an important technical step, we also derive an improved concentration inequality for similarity matrices.
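For intuition, a spectral method of the kind discussed above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the exact procedure analyzed in the paper: it assumes two communities on each side, takes the similarity matrix to be the Gram matrix $AA^\top$ with its diagonal set to zero, and clusters type I nodes by the sign of the second leading eigenvector rather than by k-means. The function names and the parameter values ($n_1 = 100$, $n_2 = 5000$, $p = 0.10$, $q = 0.02$) are hypothetical choices made only for the demonstration.

```python
import numpy as np

def simulate_bisbm(n1, n2, p, q, rng):
    """Draw a two-community BiSBM adjacency matrix (assumed toy setup)."""
    z1 = rng.integers(0, 2, size=n1)  # latent labels of type I nodes
    z2 = rng.integers(0, 2, size=n2)  # latent labels of type II nodes
    # Edge probability is p when labels match and q otherwise.
    probs = np.where(z1[:, None] == z2[None, :], p, q)
    A = (rng.random((n1, n2)) < probs).astype(float)
    return A, z1

def spectral_partition(A):
    """Cluster type I nodes from the hollowed Gram matrix A A^T."""
    B = A @ A.T
    np.fill_diagonal(B, 0.0)  # assumption: drop the diagonal of the similarity matrix
    # eigh returns eigenvalues in ascending order, so column -2 is the
    # eigenvector of the second largest eigenvalue.
    _, eigvecs = np.linalg.eigh(B)
    v2 = eigvecs[:, -2]
    return (v2 > 0).astype(int)

def accuracy_up_to_swap(labels, truth):
    """Fraction of correctly clustered nodes, up to relabeling of the two groups."""
    agree = np.mean(labels == truth)
    return max(agree, 1.0 - agree)

rng = np.random.default_rng(0)
A, z1 = simulate_bisbm(n1=100, n2=5000, p=0.10, q=0.02, rng=rng)
labels = spectral_partition(A)
acc = accuracy_up_to_swap(labels, z1)
print(f"fraction of type I nodes correctly clustered: {acc:.2f}")
```

The sign-based rule is used here only because it keeps the sketch free of extra dependencies in the two-community case; with more communities one would typically run a clustering step such as k-means on the leading eigenvectors.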