We consider minimizing a twice-differentiable, $L$-smooth, and $\mu$-strongly convex objective $\phi$ over an $n\times n$ positive semidefinite matrix $M\succeq0$, under the assumption that the minimizer $M^{\star}$ has low rank $r^{\star}\ll n$. Following the Burer--Monteiro approach, we instead minimize the nonconvex objective $f(X)=\phi(XX^{T})$ over a factor matrix $X$ of size $n\times r$. This reduces the number of variables from $O(n^{2})$ to as few as $O(n)$ and enforces positive semidefiniteness for free, but at the cost of giving up the convexity of the original problem. In this paper, we prove that if the search rank $r\ge r^{\star}$ is overparameterized by a \emph{constant factor} with respect to the true rank $r^{\star}$, namely $r>\frac{1}{4}(L/\mu-1)^{2}r^{\star}$, then despite nonconvexity, local optimization is guaranteed to converge from any initial point to the global optimum. This significantly improves upon a previous rank overparameterization threshold of $r\ge n$, which we show is sharp in the absence of smoothness and strong convexity, but which would increase the number of variables back up to $O(n^{2})$. Conversely, without rank overparameterization, we prove that such a global guarantee is possible if and only if $\phi$ is almost perfectly conditioned, with a condition number of $L/\mu<3$. Therefore, we conclude that a small amount of overparameterization can lead to large improvements in theoretical guarantees for the nonconvex Burer--Monteiro factorization.
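The rank threshold and the factored objective can be made concrete with a short numerical sketch. It is an illustration under stated assumptions, not the method analyzed in the paper: the function names `min_search_rank`, `phi_grad`, and `factored_gradient_descent` are hypothetical, the objective $\phi(M)=\frac{1}{2}\|M-M^{\star}\|_{F}^{2}$ is chosen only because it has $L=\mu=1$, and fixed-step gradient descent stands in for a generic local optimization method. The gradient uses the chain rule $\nabla f(X)=2\nabla\phi(XX^{T})X$, which holds when $\nabla\phi$ is symmetric.

```python
import math
import numpy as np


def min_search_rank(L, mu, r_star):
    """Smallest integer r with r >= r_star and r > (1/4) * (L/mu - 1)^2 * r_star."""
    threshold = 0.25 * (L / mu - 1.0) ** 2 * r_star
    # floor(threshold) + 1 is the smallest integer strictly greater than threshold.
    return max(r_star, math.floor(threshold) + 1)


def phi_grad(M, M_star):
    """Gradient of the illustrative phi(M) = 0.5 * ||M - M_star||_F^2 (assumed; L = mu = 1)."""
    return M - M_star


def factored_gradient_descent(M_star, r, step=0.05, iters=5000, seed=0):
    """Fixed-step gradient descent on f(X) = phi(X X^T) from a random start (a sketch)."""
    rng = np.random.default_rng(seed)
    n = M_star.shape[0]
    # Random initial point, scaled so its entries have variance 1/n.
    X = rng.standard_normal((n, r)) / np.sqrt(n)
    for _ in range(iters):
        G = phi_grad(X @ X.T, M_star)
        # Chain rule: grad f(X) = 2 * grad phi(X X^T) @ X, since grad phi is symmetric.
        X = X - step * 2.0 * (G @ X)
    return X


if __name__ == "__main__":
    n, r_star = 6, 1
    z = np.ones(n) / np.sqrt(n)
    M_star = np.outer(z, z)                  # rank-1 PSD minimizer
    r = min_search_rank(1.0, 1.0, r_star)    # L = mu = 1 for this phi
    X = factored_gradient_descent(M_star, r)
    print("search rank:", r)
    print("residual:", np.linalg.norm(X @ X.T - M_star))
```

For example, with $L/\mu=3$ and $r^{\star}=2$ the threshold evaluates to exactly $r^{\star}$, so the strict inequality requires $r=3$. For any $L/\mu<3$ the threshold falls below $r^{\star}$ and $r=r^{\star}$ already satisfies it, which matches the condition-number statement for the case without overparameterization.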