Given an observation $\mathbf Y \in \mathbb{R}^{d_1\times d_2}$ from the model $\mathbf Y = \mathbf X + \mathbf E$ where $\mathbf X$ is constant and $\mathbf E$ has i.i.d. $N(0,1)$ entries, we consider the problem of detecting a planted submatrix in the mean matrix $\mathbf X$. Specifically, we aim to distinguish the null hypothesis $\mathbf X = 0$ from the alternative hypothesis in which $\mathbf X$ is non-zero only on a submatrix of size $s_1 \times s_2$ with elevated entries bounded below by $\mu>0$. We establish a minimax lower bound characterizing how large $\mu$ must be to ensure that the two hypotheses are distinguishable with high probability. Furthermore, we derive novel minimax-optimal tests achieving the lower bound, and describe extensions of these tests that are adaptive to unknown sparsity levels $s_1$ and $s_2$. In contrast with previous work, which required restrictive assumptions on $s_1, s_2, d_1$ and $d_2$, our non-asymptotic upper and lower bounds match for any configuration of these parameters.
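To make the testing problem concrete, the following is a minimal simulation sketch of the observation model, paired with a classical total-sum test. It is an illustration only, under stated assumptions: the function names `sample_observation` and `total_sum_test`, the threshold, and the choice to set every planted entry exactly equal to $\mu$ are hypothetical, and the total-sum statistic is a simple textbook baseline, not one of the minimax-optimal tests derived in this work. It is not expected to be optimal for all configurations of $s_1, s_2, d_1, d_2$.

```python
import math
import random


def sample_observation(d1, d2, rows=(), cols=(), mu=0.0, rng=None):
    """Draw Y = X + E with i.i.d. N(0, 1) noise.

    Assumption for illustration: X equals mu on the submatrix rows x cols
    and 0 elsewhere. Empty rows/cols (or mu = 0) gives the null X = 0.
    """
    rng = rng or random.Random()
    row_set, col_set = set(rows), set(cols)
    return [
        [
            (mu if (i in row_set and j in col_set) else 0.0) + rng.gauss(0.0, 1.0)
            for j in range(d2)
        ]
        for i in range(d1)
    ]


def total_sum_test(Y, threshold=3.0):
    """Baseline test (not the paper's): standardized sum of all entries.

    Under the null the statistic is exactly N(0, 1). Under the planted
    alternative above, its mean is s1 * s2 * mu / sqrt(d1 * d2).
    Returns (statistic, reject_null).
    """
    d1, d2 = len(Y), len(Y[0])
    stat = sum(sum(row) for row in Y) / math.sqrt(d1 * d2)
    return stat, stat > threshold


if __name__ == "__main__":
    rng = random.Random(0)
    # Null hypothesis: pure noise.
    Y0 = sample_observation(20, 20, rng=rng)
    # Alternative: a 10 x 10 planted block with entries equal to mu = 5.
    Y1 = sample_observation(20, 20, rows=range(10), cols=range(10), mu=5.0, rng=rng)
    print("null:", total_sum_test(Y0))
    print("alternative:", total_sum_test(Y1))
```

In this sketch the planted block shifts the statistic by $s_1 s_2 \mu / \sqrt{d_1 d_2}$, so the baseline only detects the signal when that quantity is large. When the block is small relative to the matrix, this shift is small even for sizeable $\mu$, which is the kind of regime where a more refined test is needed.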