We consider the task of detecting a hidden bipartite subgraph in a given random graph. This is formulated as a hypothesis testing problem: under the null hypothesis, the graph is a realization of an Erd\H{o}s--R\'{e}nyi random graph over $n$ vertices with edge density $q$. Under the alternative, there exists a planted $k_{\mathsf{R}} \times k_{\mathsf{L}}$ bipartite subgraph with edge density $p>q$. We characterize the statistical and computational barriers for this problem. Specifically, we derive information-theoretic lower bounds, and design and analyze optimal algorithms matching those bounds, in both the dense regime, where $p,q = \Theta\left(1\right)$, and the sparse regime, where $p,q = \Theta\left(n^{-\alpha}\right)$, $\alpha \in \left(0,2\right]$. We also consider the problem of testing in polynomial time. As is customary in similar structured high-dimensional problems, our model undergoes an "easy-hard-impossible" phase transition, and computational constraints penalize the statistical performance. To provide evidence for this statistical-computational gap, we prove computational lower bounds based on the low-degree conjecture, and show that the class of low-degree polynomial algorithms fails in the conjecturally hard region.
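To make the two hypotheses concrete, the following is a minimal simulation sketch in Python with NumPy. It is an illustration under stated assumptions, not the algorithms analyzed in this work: the function names `sample_graph` and `edge_count` are hypothetical, the planted vertex sets are assumed to be disjoint and chosen uniformly at random, and edges inside each planted side are assumed to keep the background density $q$. The total edge count is used only as a simple example statistic whose mean shifts by $k_{\mathsf{R}} k_{\mathsf{L}} (p-q)$ under the alternative.

```python
import numpy as np


def sample_graph(n, q, rng, planted=None, p=None):
    """Return a symmetric 0/1 adjacency matrix with zero diagonal.

    Null (planted=None): every edge appears independently with probability q.
    Alternative (planted=(k_r, k_l)): edges between two disjoint random vertex
    sets of sizes k_r and k_l appear with probability p instead of q.
    Assumption: edges inside each planted side keep density q.
    """
    probs = np.full((n, n), float(q))
    if planted is not None:
        k_r, k_l = planted
        verts = rng.choice(n, size=k_r + k_l, replace=False)
        right, left = verts[:k_r], verts[k_r:]
        probs[np.ix_(right, left)] = p
        probs[np.ix_(left, right)] = p
    # Draw each unordered pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(np.int8)


def edge_count(adj):
    """Total number of edges; each edge is stored twice in a symmetric matrix."""
    return int(adj.sum()) // 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    null_graph = sample_graph(200, 0.1, rng)
    alt_graph = sample_graph(200, 0.1, rng, planted=(40, 40), p=0.9)
    print("edges under null:       ", edge_count(null_graph))
    print("edges under alternative:", edge_count(alt_graph))
```

With these illustrative parameters the planted block adds about $40 \cdot 40 \cdot 0.8 = 1280$ edges in expectation, far more than the fluctuation of the null edge count, so even this crude statistic separates the two samples. The regimes of interest are those where such a shift becomes comparable to, or smaller than, the noise.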