We study the task of agnostically learning general (as opposed to homogeneous) ReLUs under the Gaussian distribution with respect to the squared loss. In the passive learning setting, recent work gave a computationally efficient algorithm that uses $\mathrm{poly}(d,1/\epsilon)$ labeled examples and outputs a hypothesis with error $O(\mathrm{opt})+\epsilon$, where $\mathrm{opt}$ is the squared loss of the best fit ReLU. Here we focus on the interactive setting, where the learner has some form of query access to the labels of unlabeled examples. Our main result is the first computationally efficient learner that uses $d\,\mathrm{polylog}(1/\epsilon)+\tilde{O}(\min\{1/p, 1/\epsilon\})$ black-box label queries, where $p$ is the bias of the target function, and achieves error $O(\mathrm{opt})+\epsilon$. We complement our algorithmic result by showing that its query complexity bound is qualitatively near-optimal, even ignoring computational constraints. Finally, we establish that query access is essentially necessary to improve on the label complexity of passive learning. Specifically, for pool-based active learning, any active learner requires $\tilde{\Omega}(d/\epsilon)$ labels, unless it draws a super-polynomial number of unlabeled examples.