Major progress has been made over the past decade in characterizing the asymptotic behavior of regularized M-estimators in high-dimensional regression problems, in the proportional asymptotic regime where the sample size $n$ and the number of features $p$ increase simultaneously such that $n/p\to \delta \in(0,\infty)$, using powerful tools such as Approximate Message Passing or the Convex Gaussian Min-Max Theorem (CGMT). The asymptotic error and behavior of the regularized M-estimator are then typically described by a system of nonlinear equations with a few scalar unknowns, and the solution to this system precisely characterizes the asymptotic error. Application of the CGMT and related machinery requires the existence of a solution to this low-dimensional system of equations. This paper resolves the question of existence of a solution to this low-dimensional system for the case of linear models with independent additive noise, when both the data-fitting loss function and the regularization penalty are separable and convex. Such existence results for solutions to the nonlinear system were previously known under strong convexity or for specific estimators such as the Lasso. The main idea behind this existence result is inspired by an argument developed in \cite{montanari2019generalization,celentano2020lasso} in different contexts: we construct an ad hoc convex minimization problem in an infinite-dimensional Hilbert space, and the existence of a Lagrange multiplier for this optimization problem makes it possible to construct solutions to the low-dimensional system of interest explicitly. The conditions under which we derive this existence result correspond exactly to the side of the phase transition where perfect recovery $\hat x= x_0$ fails, so these conditions are optimal.
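For concreteness, the setting described above can be sketched as follows. The notation is assumed for illustration only: the design matrix $A\in\mathbb{R}^{n\times p}$ with rows $a_i^\top$, the noise vector $z$, the loss $\rho$ and the penalty $f$ are symbols introduced here and are not fixed by the text, which only names $\hat x$, $x_0$, $n$, $p$ and $\delta$.
\[
y = A x_0 + z, \qquad
\hat x \in \operatorname*{arg\,min}_{x\in\mathbb{R}^p} \Big\{ \sum_{i=1}^n \rho\big(y_i - a_i^\top x\big) + \sum_{j=1}^p f(x_j) \Big\}, \qquad
n/p \to \delta \in (0,\infty),
\]
with $\rho$ and $f$ convex. Under this notation, the Lasso corresponds to $\rho(u)=u^2/2$ and $f(x)=\lambda|x|$ for some $\lambda>0$.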