We revisit the fundamental problem of learning with distribution shift, in which a learner is given labeled samples from a training distribution $D$ and unlabeled samples from a test distribution $D'$, and is asked to output a classifier with low test error. The standard approach in this setting is to bound the loss of a classifier in terms of some notion of distance between $D$ and $D'$. These distances, however, seem difficult to compute and do not lead to efficient algorithms. We depart from this paradigm and define a new model called testable learning with distribution shift, in which we can obtain provably efficient algorithms for certifying the performance of a classifier on a test distribution. In this model, a learner outputs a classifier with low test error whenever samples from $D$ and $D'$ pass an associated test; moreover, the test must accept if the marginal of $D$ equals the marginal of $D'$. We give several positive results for learning well-studied concept classes such as halfspaces, intersections of halfspaces, and decision trees when the marginal of $D$ is Gaussian or uniform on $\{\pm 1\}^d$. Prior to our work, no efficient algorithms for these basic cases were known without strong assumptions on $D'$. For halfspaces in the realizable case (where there exists a halfspace consistent with both $D$ and $D'$), we combine a moment-matching approach with ideas from active learning to simulate an efficient oracle for estimating disagreement regions. To extend to the non-realizable setting, we apply recent work from testable (agnostic) learning. More generally, we prove that any function class with low-degree $L_2$-sandwiching polynomial approximators can be learned in our model. We apply constructions from the pseudorandomness literature to obtain the required approximators.
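To make the moment-matching idea concrete, the following is a minimal sketch of a low-degree moment test against a standard Gaussian marginal. It is an illustration only, not the tester from this work: the function names, the degree bound `max_degree`, and the threshold `tolerance` are all hypothetical choices, and the sketch assumes the training marginal is exactly $N(0, I)$ so that its moments are known in closed form. The test compares each empirical low-degree moment of the unlabeled test samples with the corresponding Gaussian moment and accepts only if all of them agree up to the tolerance.

```python
import itertools
import math
import random


def gaussian_moment(alpha):
    """E[prod_i x_i^alpha_i] for x ~ N(0, I), as a product of 1-D moments."""
    value = 1.0
    for a in alpha:
        if a % 2 == 1:
            return 0.0  # odd moments of a centered Gaussian vanish
        # For even a, E[z^a] = (a - 1)!! = (a - 1)(a - 3)...1 (empty product = 1).
        value *= math.prod(range(a - 1, 0, -2))
    return value


def multi_indices(d, max_degree):
    """Yield all exponent vectors alpha in N^d with 1 <= |alpha| <= max_degree."""
    for alpha in itertools.product(range(max_degree + 1), repeat=d):
        if 1 <= sum(alpha) <= max_degree:
            yield alpha


def empirical_moment(samples, alpha):
    """Average of the monomial x^alpha over the given samples."""
    total = 0.0
    for x in samples:
        total += math.prod(xi ** a for xi, a in zip(x, alpha))
    return total / len(samples)


def moment_matching_test(test_samples, max_degree=2, tolerance=0.1):
    """Accept iff every moment of degree <= max_degree is within `tolerance`
    of the standard Gaussian value. Both parameters are illustrative."""
    d = len(test_samples[0])
    return all(
        abs(empirical_moment(test_samples, alpha) - gaussian_moment(alpha)) <= tolerance
        for alpha in multi_indices(d, max_degree)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    unshifted = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(20000)]
    shifted = [[xi + 1.0 for xi in x] for x in unshifted]
    print("unshifted accepted:", moment_matching_test(unshifted))
    print("shifted accepted:", moment_matching_test(shifted))
```

With enough samples, a test of this kind accepts when the test marginal equals the Gaussian training marginal, which mirrors the completeness requirement stated above. Matching a few low-degree moments does not by itself certify low test error; in the abstract, moment matching is combined with ideas from active learning for that purpose.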