We study a setting where the goal is to learn a target function f(x) with respect to a target distribution D(x), but training is done on i.i.d. samples from a different training distribution D'(x), labeled by the true target f(x). Such a distribution shift (here in the form of covariate shift) is usually viewed negatively, as something that hurts performance or makes learning harder, and the traditional distribution-shift literature is mostly concerned with limiting or avoiding this negative effect. In contrast, we argue that with a well-chosen D'(x) the shift can be positive and make learning easier -- a perspective we call Positive Distribution Shift (PDS). This perspective is central to contemporary machine learning, where much of the innovation lies in finding good training distributions D'(x) rather than in changing the training algorithm. We further argue that the benefit is often computational rather than statistical: PDS can make computationally hard problems tractable even with standard gradient-based training. We formalize different variants of PDS, show that certain hard classes become easily learnable under PDS, and draw connections to membership query learning.
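As a toy illustration of the setting (our sketch, not a construction from the paper), consider learning a sparse parity f(x) = prod_{i in S} x_i over {-1,+1}^n, where S, the single-flip training distribution, and all sample sizes below are illustrative choices. Under the uniform target distribution D, no single coordinate correlates with the label, which is what makes parities hard for correlation-driven learners; a training distribution D' concentrated on single-coordinate flips of the all-ones vector turns each labeled sample into a membership-query-like probe of "is i in S?", in the spirit of the connection to membership query learning.

```python
# Toy PDS illustration (our sketch): learning a sparse parity
# f(x) = prod_{i in S} x_i over {-1,+1}^n. The target distribution D is
# uniform; the training distribution D' is supported on single-coordinate
# flips of the all-ones vector. The construction and all constants here
# (n, k, m, the flip distribution) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 30, 3, 5000                     # dimension, |S|, sample size
S = rng.choice(n, size=k, replace=False)  # hidden relevant set

def f(X):
    """Parity of the coordinates in S; labels in {-1, +1}."""
    return np.prod(X[:, S], axis=1)

# --- Target distribution D: uniform over {-1,+1}^n ---
X_D = rng.choice([-1, 1], size=(m, n))
y_D = f(X_D)
# For |S| >= 2, every single-coordinate correlation E[x_i * y] is ~0:
# uniform samples hide S from coordinate-wise statistics.
print("max |corr(x_i, y)| under D :", np.abs(X_D.T @ y_D / m).max())

# --- Training distribution D': all-ones vector with one coordinate flipped ---
flip = rng.integers(0, n, size=m)
X_Dp = np.ones((m, n), dtype=int)
X_Dp[np.arange(m), flip] = -1
y_Dp = f(X_Dp)                            # labels still come from the true f

# Each sample answers a membership query: y = -1 exactly when the flipped
# coordinate lies in S, so S can be read off the labeled training set.
S_hat = np.unique(flip[y_Dp == -1])
print("recovered S:", sorted(S_hat), " true S:", sorted(S))

# The hypothesis "parity on S_hat" generalizes perfectly back to D.
X_test = rng.choice([-1, 1], size=(m, n))
acc = np.mean(np.prod(X_test[:, S_hat], axis=1) == f(X_test))
print("accuracy on target distribution D:", acc)
```

The shift here is the whole story: the labeling function f and the learner's statistics are unchanged, and only the choice of D' moves the problem from one with no usable coordinate-wise signal to one where the relevant structure is exposed directly, which is the computational (rather than statistical) flavor of benefit described above.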