In this work, we focus on the Bipartite Stochastic Block Model (BiSBM), a popular model for bipartite graphs with a community structure. We consider the high-dimensional setting where the number $n_1$ of type I nodes is far smaller than the number $n_2$ of type II nodes. The recent work of Braun and Tyagi (2022) established a necessary and sufficient condition on the sparsity level $p_{max}$ of the bipartite graph for recovering the latent partition of type I nodes. To achieve this goal, they proposed an iterative method that extends the one proposed by Ndaoud et al. (2022). Their method requires a good enough initialization, usually obtained by a spectral method, but empirical results showed that the refinement algorithm does not substantially improve on the performance of the spectral method. This suggests that the spectral method achieves exact recovery in the same regime as the refinement method. We show that this is indeed the case by providing new entrywise bounds on the eigenvectors of the similarity matrix used by the spectral method. Our analysis extends the framework of Lei (2019), which only applies to symmetric matrices with limited dependencies. As an important technical step, we also derive an improved concentration inequality for similarity matrices.
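As a rough illustration of the kind of spectral method discussed above, the following sketch clusters the type I nodes of a simulated bipartite graph. It is not the authors' implementation and rests on several assumptions that do not come from the text: there are only two communities, edges appear with probability `p` between nodes carrying the same label and `q` otherwise, the similarity matrix is taken to be the Gram matrix $AA^\top$ with its diagonal set to zero, and labels are read off the sign of one eigenvector rather than obtained by a clustering step such as k-means. The function names `sample_bisbm` and `spectral_labels` are hypothetical.

```python
import numpy as np


def sample_bisbm(n1, n2, p, q, rng):
    """Sample a two-community bipartite graph (illustrative parametrization).

    Returns the n1 x n2 biadjacency matrix and the latent labels of both sides.
    """
    z1 = rng.integers(0, 2, size=n1)  # latent labels of type I nodes
    z2 = rng.integers(0, 2, size=n2)  # latent labels of type II nodes
    # Edge probability is p when the two labels agree and q otherwise.
    probs = np.where(z1[:, None] == z2[None, :], p, q)
    A = (rng.random((n1, n2)) < probs).astype(float)
    return A, z1, z2


def spectral_labels(A):
    """Estimate type I labels from the zero-diagonal Gram matrix (K = 2 only)."""
    B = A @ A.T                 # n1 x n1 similarity between type I nodes
    np.fill_diagonal(B, 0.0)    # assumption: discard the diagonal entries
    # eigh returns eigenvalues in ascending order, so column -2 belongs to
    # the second largest eigenvalue, which separates the two communities
    # in this parametrization (the top eigenvector is roughly constant).
    _, eigvecs = np.linalg.eigh(B)
    v = eigvecs[:, -2]
    return (v > 0).astype(int)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, z1, _ = sample_bisbm(n1=100, n2=3000, p=0.2, q=0.02, rng=rng)
    pred = spectral_labels(A)
    # Labels are only identifiable up to a swap of the two communities.
    accuracy = max(np.mean(pred == z1), np.mean(pred != z1))
    print(f"fraction of type I nodes correctly labelled: {accuracy:.2f}")
```

The example uses $n_1 = 100$ and $n_2 = 3000$ to mimic the setting $n_1 \ll n_2$. The parameters are chosen so that the signal is strong; they are not meant to probe the sparsity threshold on $p_{max}$.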