In this work, we focus on the Bipartite Stochastic Block Model (BiSBM), a popular model for bipartite graphs with community structure. We consider the high-dimensional setting where the number $n_1$ of type I nodes is far smaller than the number $n_2$ of type II nodes. The recent work of Braun and Tyagi (2022) established a necessary and sufficient condition on the sparsity level $p_{max}$ of the bipartite graph for recovering the latent partition of type I nodes. To achieve this goal, they proposed an iterative method that extends the one proposed by Ndaoud et al. (2022). Their method requires a good enough initialization, usually obtained by a spectral method, but empirical results showed that the refinement algorithm does not improve much on the performance of the spectral method. This suggests that the spectral method achieves exact recovery in the same regime as the refinement method. We show that this is indeed the case by providing new entrywise bounds on the eigenvectors of the similarity matrix used by the spectral method. Our analysis extends the framework of Lei (2019), which applies only to symmetric matrices with limited dependencies. As an important technical step, we also derive an improved concentration inequality for similarity matrices.
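To make the setting concrete, the following is a minimal sketch of a spectral method applied to a similarity matrix, under several assumptions that are not taken from the abstract: two communities on each side, within- and across-community edge probabilities `p` and `q`, a similarity matrix formed as $AA^\top$ with its diagonal zeroed, and sign thresholding of the second eigenvector in place of a general clustering step. The function names and parameter values are illustrative only, and this should not be read as the exact procedure analyzed in the paper.

```python
import numpy as np


def sample_bisbm(n1, n2, p, q, rng):
    """Sample a two-community bipartite adjacency matrix (illustrative model).

    Edge (i, j) is present with probability p when type I node i and
    type II node j carry the same label, and q otherwise.
    """
    z1 = rng.integers(0, 2, size=n1)  # latent labels of type I nodes
    z2 = rng.integers(0, 2, size=n2)  # latent labels of type II nodes
    probs = np.where(z1[:, None] == z2[None, :], p, q)
    A = (rng.random((n1, n2)) < probs).astype(float)
    return A, z1


def spectral_two_communities(A):
    """Cluster type I nodes from the similarity matrix A A^T.

    Assumption: the diagonal is zeroed before the eigendecomposition,
    and the sign of the second leading eigenvector gives the partition.
    """
    B = A @ A.T
    np.fill_diagonal(B, 0.0)
    _, vecs = np.linalg.eigh(B)  # eigenvalues in ascending order
    return (vecs[:, -2] > 0).astype(int)


def accuracy_up_to_swap(pred, truth):
    """Fraction of correctly labeled nodes, up to swapping the two labels."""
    agree = float(np.mean(pred == truth))
    return max(agree, 1.0 - agree)


rng = np.random.default_rng(0)
# High-dimensional regime: n1 much smaller than n2.
A, z1 = sample_bisbm(n1=200, n2=5000, p=0.08, q=0.02, rng=rng)
pred = spectral_two_communities(A)
acc = accuracy_up_to_swap(pred, z1)
print(f"fraction of type I nodes recovered: {acc:.3f}")
```

With more than two communities, the sign step would typically be replaced by clustering the rows of the leading eigenvectors, for example with k-means.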