A single-index model (SIM) is a function of the form $\sigma(\mathbf{w}^{\ast} \cdot \mathbf{x})$, where $\sigma: \mathbb{R} \to \mathbb{R}$ is a known link function and $\mathbf{w}^{\ast}$ is a hidden unit vector. We study the task of learning SIMs in the agnostic (a.k.a. adversarial label noise) model with respect to the $L_2^2$-loss under the Gaussian distribution. Our main result is a sample- and computationally efficient agnostic proper learner that attains $L_2^2$-error of $O(\mathrm{OPT})+\epsilon$, where $\mathrm{OPT}$ is the optimal loss. The sample complexity of our algorithm is $\tilde{O}(d^{\lceil k^{\ast}/2\rceil}+d/\epsilon)$, where $k^{\ast}$ is the information exponent of $\sigma$, i.e., the degree of its first non-zero Hermite coefficient. This sample bound nearly matches known Correlational Statistical Query (CSQ) lower bounds, even in the realizable setting. Prior algorithmic work in this setting had focused on learning in the realizable case or in the presence of semi-random noise. Prior computationally efficient robust learners required significantly stronger assumptions on the link function.
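To make the definition of the information exponent concrete, here is a minimal numerical sketch (not part of the paper's algorithm): it estimates the Hermite coefficients $c_k = \mathbb{E}_{g\sim\mathcal{N}(0,1)}[\sigma(g)\,\mathrm{He}_k(g)]/\sqrt{k!}$ via Gauss-Hermite quadrature and returns the smallest $k \geq 1$ with $c_k \neq 0$. The function name and tolerance are illustrative choices.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He  # probabilists' Hermite polynomials He_k


def information_exponent(sigma, max_degree=10, tol=1e-6):
    """Estimate k*: the degree of the first non-zero Hermite coefficient of sigma."""
    # Quadrature nodes/weights for the weight exp(-x^2/2); normalizing the
    # weights turns the sum into an expectation under the standard Gaussian.
    x, w = He.hermegauss(100)
    w = w / w.sum()
    vals = sigma(x)
    for k in range(1, max_degree + 1):
        coef = np.zeros(k + 1)
        coef[k] = 1.0  # coefficient vector selecting He_k
        c_k = np.dot(w, vals * He.hermeval(x, coef)) / math.sqrt(math.factorial(k))
        if abs(c_k) > tol:
            return k
    return None


# ReLU correlates with He_1 (c_1 = 1/2), so k* = 1; the even link x -> x^2
# has c_1 = E[g^3] = 0 but c_2 != 0, so k* = 2.
print(information_exponent(lambda x: np.maximum(x, 0.0)))  # 1
print(information_exponent(lambda x: x**2))                # 2
```

For links with $k^{\ast} = 1$ (e.g., ReLU or sigmoid) the stated sample bound is $\tilde{O}(d/\epsilon)$; larger information exponents drive the $d^{\lceil k^{\ast}/2\rceil}$ term.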