We study the square-root bottleneck in the recovery of sparse vectors from quadratic equations. It is known that a sparse vector $\mathbf x_0\in \mathbb{R}^n$ with $\|\mathbf x_0\|_0 = k$ can in theory be recovered from as few as $m = O(k)$ generic quadratic equations, but no polynomial-time algorithm is known for this task unless $m = \Omega(k^2)$. Previous work showed that this bottleneck is essentially tied to the initialization of descent algorithms: started sufficiently close to the planted signal, such algorithms are known to converge to it. In this paper, we show that as soon as $m\gtrsim \mu_0^{-2}k \vee \mu_0^{-4}$ (up to log factors), where $\mu_0 = \|\mathbf x_0\|_\infty/\|\mathbf x_0\|_2$, it is possible to recover a $k$-sparse vector $\mathbf x_0\in \mathbb{R}^n$ from $m$ quadratic equations of the form $\langle \mathbf A_i, \mathbf x \mathbf x^\intercal\rangle = \langle \mathbf A_i, \mathbf x_0 \mathbf x_0^\intercal\rangle + \varepsilon_i$ by minimizing the classical empirical loss. The proof idea carries over to the phase retrieval setting, for which it provides an original initialization that matches the current optimal sample complexity (see e.g. [Cai 2023]). In the maximally incoherent regime $\mu_0^{-2}=k$ with $m=o(k^2)$, we provide evidence for topological hardness by showing that the Overlap Gap Property (OGP) holds for a particular level of overparametrization. The OGP originated in spin glass theory and is conjectured to be indicative of algorithmic intractability when optimizing over random structures. The key ingredient of the proof is a lower bound on the tail of chi-squared random variables, which follows from the theory of moderate deviations.
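To make the measurement model concrete, the following is a minimal NumPy sketch of the quadratic equations $\langle \mathbf A_i, \mathbf x \mathbf x^\intercal\rangle$ and of a least-squares empirical loss. It rests on several assumptions that the abstract does not state: the matrices $\mathbf A_i$ are drawn with i.i.d. standard Gaussian entries, the planted signal has flat $\pm 1$ entries on its support (which gives the maximally incoherent case $\mu_0^{-2}=k$), and the loss is taken to be $\frac{1}{2m}\sum_i (\langle \mathbf A_i, \mathbf x \mathbf x^\intercal\rangle - y_i)^2$. The function names are hypothetical and chosen for illustration only.

```python
import numpy as np


def sparse_signal(n, k, rng):
    """Hypothetical helper: k-sparse vector with flat +/-1 entries on a random support."""
    x0 = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x0[support] = rng.choice([-1.0, 1.0], size=k)
    return x0


def coherence(x0):
    """mu_0 = ||x0||_inf / ||x0||_2, as defined in the abstract."""
    return np.max(np.abs(x0)) / np.linalg.norm(x0)


def measure(A, x, noise=None):
    """Quadratic measurements <A_i, x x^T> = x^T A_i x for a stack A of shape (m, n, n)."""
    y = np.einsum("ijk,j,k->i", A, x, x)
    return y if noise is None else y + noise


def empirical_loss(A, y, x):
    """Assumed least-squares loss: (1 / 2m) * sum_i (<A_i, x x^T> - y_i)^2."""
    residual = measure(A, x) - y
    return 0.5 * np.mean(residual ** 2)


rng = np.random.default_rng(0)
n, k, m = 50, 5, 200

x0 = sparse_signal(n, k, rng)
A = rng.standard_normal((m, n, n))  # assumption: i.i.d. Gaussian measurement matrices
y = measure(A, x0)                  # noiseless case, epsilon_i = 0

mu0 = coherence(x0)
# Sample-size scale from the abstract, ignoring log factors and constants.
sample_scale = max(k / mu0 ** 2, mu0 ** -4)

print("mu0^-2:", mu0 ** -2)
print("sample scale:", sample_scale)
print("loss at x0:", empirical_loss(A, y, x0))
print("loss at -x0:", empirical_loss(A, y, -x0))
```

With flat entries, $\mu_0^{-2}=k$ and both terms of $\mu_0^{-2}k \vee \mu_0^{-4}$ equal $k^2$, so the bound reduces to the $k^2$ scale of the bottleneck in this regime. The loss vanishes at both $\mathbf x_0$ and $-\mathbf x_0$, reflecting the sign ambiguity inherent in quadratic measurements.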