We study a setting where the goal is to learn a target function f(x) with respect to a target distribution D(x), but training is done on i.i.d. samples from a different training distribution D'(x), labeled by the true target f(x). Such a distribution shift (here in the form of covariate shift) is usually viewed negatively, as something that hurts performance and makes learning harder, and the traditional distribution shift literature is mostly concerned with limiting or avoiding this negative effect. In contrast, we argue that with a well-chosen D'(x), the shift can be positive and make learning easier -- a perspective we call Positive Distribution Shift (PDS). This perspective is central to contemporary machine learning, where much of the innovation lies in finding good training distributions D'(x) rather than in changing the training algorithm. We further argue that the benefit is often computational rather than statistical, and that PDS can make computationally hard problems tractable even with standard gradient-based training. We formalize several variants of PDS, show that certain hard function classes become easily learnable under PDS, and draw connections to membership query learning.
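To pin down the setting, the following is a minimal LaTeX sketch of the covariate-shift setup described above; the hypothesis h, sample size n, and loss ell are illustrative notation introduced here, not taken from the abstract.

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% A minimal formalization of the covariate-shift setup sketched above.
% The hypothesis h, sample size n, and loss \ell are illustrative
% notation, not fixed by the abstract itself.
The learner receives $n$ i.i.d. samples from the training distribution
$D'$, labeled by the true target $f$,
\begin{align*}
  S = \{(x_i, f(x_i))\}_{i=1}^{n}, \qquad x_i \sim D'(x),
\end{align*}
but is judged by its risk under the target distribution $D$:
\begin{align*}
  L_D(h) = \mathbb{E}_{x \sim D}\bigl[\,\ell\bigl(h(x), f(x)\bigr)\bigr].
\end{align*}
\end{document}

Under this formalization, the PDS perspective asks when choosing D' different from D makes minimizing L_D(h) easier, rather than harder.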