Existence of solutions to the nonlinear equations characterizing the precise error of M-estimators

Major progress has been made over the past decade in characterizing the asymptotic behavior of regularized M-estimators in high-dimensional regression problems in the proportional asymptotic regime, where the sample size $n$ and the number of features $p$ increase simultaneously such that $n/p\to \delta \in(0,\infty)$, using powerful tools such as Approximate Message Passing or the Convex Gaussian Min-Max Theorem (CGMT). The asymptotic error and behavior of the regularized M-estimator are then typically described by a system of nonlinear equations with a few scalar unknowns, and the solution to this system precisely characterizes the asymptotic error. Applying the CGMT and related machinery requires the existence of a solution to this low-dimensional system of equations. This paper resolves the question of existence of a solution to this low-dimensional system for linear models with independent additive noise, when both the data-fitting loss function and the regularization penalty are separable and convex. Such existence results for solutions to the nonlinear system were previously known under strong convexity for specific estimators such as the Lasso. The main idea behind this existence result is inspired by an argument developed in \cite{montanari2019generalization,celentano2020lasso} in different contexts: one constructs an ad hoc convex minimization problem in an infinite-dimensional Hilbert space, and the existence of a Lagrange multiplier for this optimization problem makes it possible to explicitly construct solutions to the low-dimensional system of interest. The conditions under which we derive this existence result correspond exactly to the side of the phase transition where perfect recovery $\hat x= x_0$ fails, so these conditions are optimal.
