The planted densest subgraph detection problem is the task of testing whether a given random graph contains an unusually dense subgraph. Specifically, we observe an undirected, unweighted graph on $n$ nodes. Under the null hypothesis, the graph is a realization of an Erd\H{o}s-R\'{e}nyi graph with edge probability (or density) $q$. Under the alternative, there is a subgraph on $k$ vertices with edge probability $p>q$. The statistical and computational barriers of this problem are well understood for a wide range of the edge parameters $p$ and $q$. In this paper, we consider a natural variant of this problem in which only a small part of the graph can be observed, through adaptive edge queries. For this model, we determine the number of queries that is necessary and sufficient for detecting the presence of the planted subgraph. Specifically, we show that any (possibly randomized) algorithm must make $\mathsf{Q} = \Omega(\frac{n^2}{k^2\chi^4(p||q)}\log^2 n)$ adaptive queries (in expectation) to the adjacency matrix of the graph to detect the planted subgraph with probability greater than $1/2$, where $\chi^2(p||q)$ is the chi-square distance. On the other hand, we devise a quasi-polynomial-time algorithm that detects the planted subgraph with high probability by making $\mathsf{Q} = O(\frac{n^2}{k^2\chi^4(p||q)}\log^2 n)$ non-adaptive queries. We then propose a polynomial-time algorithm that detects the planted subgraph using $\mathsf{Q} = O(\frac{n^3}{k^3\chi^2(p||q)}\log^3 n)$ queries. We conjecture that in the remaining regime, where $\frac{n^2}{k^2}\ll\mathsf{Q}\ll \frac{n^3}{k^3}$, no polynomial-time algorithm exists. Our results resolve two questions posed in \cite{racz2020finding}, where the special case of adaptive detection and recovery of a planted clique was considered.
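To make the two hypotheses and the query scalings concrete, the following is a minimal Python sketch, not the algorithms of the paper. It rests on several assumptions: that $\chi^2(p||q)$ denotes the chi-square divergence between Bernoulli($p$) and Bernoulli($q$), namely $(p-q)^2/(q(1-q))$; that $\chi^4$ means the square of that quantity; and that all hidden constants equal 1. The function names `chi_square_bernoulli`, `query_scalings`, and `sample_graph` are hypothetical and introduced only for illustration.

```python
import math
import random


def chi_square_bernoulli(p, q):
    """Chi-square divergence between Bernoulli(p) and Bernoulli(q).

    Assumption: this is the quantity the abstract writes as chi^2(p||q).
    Requires 0 < q < 1.
    """
    return (p - q) ** 2 / (q * (1 - q))


def query_scalings(n, k, p, q):
    """Return the two query scalings from the abstract, constants set to 1.

    First value:  n^2 / (k^2 chi^4) * log^2 n  (quasi-polynomial-time regime)
    Second value: n^3 / (k^3 chi^2) * log^3 n  (polynomial-time algorithm)
    Assumption: chi^4 is interpreted as (chi^2)^2.
    """
    chi2 = chi_square_bernoulli(p, q)
    log_n = math.log(n)
    quasi_poly = n ** 2 / (k ** 2 * chi2 ** 2) * log_n ** 2
    poly = n ** 3 / (k ** 3 * chi2) * log_n ** 3
    return quasi_poly, poly


def sample_graph(n, k, p, q, planted, rng):
    """Sample a graph under the null (planted=False) or the alternative.

    Returns (edges, planted_set), where edges is a set of pairs (u, v)
    with u < v. Under the alternative, k vertices are chosen uniformly at
    random and pairs inside that set are connected with probability p;
    every other pair is connected with probability q.
    """
    planted_set = set(rng.sample(range(n), k)) if planted else set()
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            inside = u in planted_set and v in planted_set
            if rng.random() < (p if inside else q):
                edges.add((u, v))
    return edges, planted_set


if __name__ == "__main__":
    rng = random.Random(0)
    edges, planted_set = sample_graph(200, 20, 0.5, 0.25, True, rng)
    print(len(edges), sorted(planted_set))
    print(query_scalings(200, 20, 0.5, 0.25))
```

An edge query in this setting corresponds to checking whether a single pair `(u, v)` belongs to `edges`; the scalings returned by `query_scalings` only indicate orders of magnitude, since the constants in the paper's bounds are not given in the abstract.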