Observable adjustments in single-index models for regularized M-estimators

We consider observations $(X,y)$ from single index models with unknown link function and Gaussian covariates, together with a regularized M-estimator $\hat\beta$ constructed from a convex loss function and regularizer. In the regime where the sample size $n$ and the dimension $p$ both increase such that $p/n$ has a finite limit, the behavior of the empirical distributions of $\hat\beta$ and of the predicted values $X\hat\beta$ has previously been characterized in a number of models: the empirical distributions are known to converge to proximal operators of the loss and penalty in a related Gaussian sequence model, which captures the interplay between the ratio $p/n$, the loss, the regularization and the data-generating process. This connection between $(\hat\beta,X\hat\beta)$ and the corresponding proximal operators requires solving fixed-point equations that typically involve unobservable quantities such as the prior distribution on the index or the link function. This paper develops a different theory to describe the empirical distributions of $\hat\beta$ and $X\hat\beta$: approximations of $(\hat\beta,X\hat\beta)$ in terms of proximal operators are provided that involve only observable adjustments. These proposed observable adjustments are data-driven; for example, they do not require prior knowledge of the index or the link function. The new adjustments yield confidence intervals for individual components of the index, as well as estimators of the correlation of $\hat\beta$ with the index. The interplay between loss, regularization and the model is thus captured in a data-driven manner, without solving the fixed-point equations studied in previous works. The results apply to both strongly convex regularizers and unregularized M-estimation. Simulations are provided for the square and logistic losses in single index models, including logistic regression and 1-bit compressed sensing with 20\% corrupted bits.
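To make the simulation setting concrete, the following is a minimal sketch of one such single index model: 1-bit compressed sensing with Gaussian covariates and 20% of the bits flipped, fitted with the square loss and a ridge penalty (one example of a strongly convex regularizer). The function names `simulate_one_bit` and `ridge`, the sizes `n=400` and `p=200`, the penalty level `lam=1.0`, and the normalization of the objective are all illustrative assumptions, not taken from the paper. The sketch does not implement the paper's observable adjustments; the correlation below is computed from the true index, which is available only in simulation.

```python
import numpy as np


def simulate_one_bit(n=400, p=200, flip_prob=0.2, seed=0):
    """Draw (X, y, w) from a 1-bit compressed sensing model with flipped bits.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(p)
    w /= np.linalg.norm(w)            # unit-norm index
    X = rng.standard_normal((n, p))   # Gaussian covariates
    y = np.sign(X @ w)                # 1-bit measurements of the index
    flips = rng.random(n) < flip_prob
    y[flips] *= -1                    # corrupt roughly flip_prob of the bits
    return X, y, w


def ridge(X, y, lam=1.0):
    """Square loss with ridge penalty:
    argmin_b ||y - X b||^2 / (2 n) + lam * ||b||^2 / 2 (assumed normalization).
    """
    n, p = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(p), X.T @ y / n)


X, y, w = simulate_one_bit()
beta_hat = ridge(X, y)

# Oracle correlation with the index: uses the true w, so it is only
# computable in simulation, unlike a data-driven estimate.
corr = float(beta_hat @ w / np.linalg.norm(beta_hat))
print(f"correlation of beta_hat with the index: {corr:.3f}")
```

A data-driven estimate of this correlation, as described in the abstract, would be compared against the oracle value `corr` in such a simulation.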
