Data augmentation plays a pivotal role in enhancing and diversifying training data. Nonetheless, consistently improving model performance across varied learning scenarios, especially those with inherent data biases, remains challenging. To address this, we propose augmenting the deep features of samples with their adversarial and anti-adversarial perturbation distributions, which enables the learning difficulty to be adjusted adaptively according to each sample's specific characteristics. We then show theoretically that our augmentation process approximates the optimization of a surrogate loss function as the number of augmented copies tends to infinity. This insight leads us to develop a meta-learning-based framework that optimizes classifiers with this novel loss, introducing the effects of augmentation while bypassing the explicit augmentation process. We conduct extensive experiments across four common biased learning scenarios: long-tail learning, generalized long-tail learning, noisy label learning, and subpopulation shift learning. The empirical results demonstrate that our method consistently achieves state-of-the-art performance, highlighting its broad adaptability.
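To make the notion of adversarial and anti-adversarial feature perturbations concrete, the following is a minimal sketch and not the method proposed here. It assumes a linear softmax classifier head with hypothetical parameters `W` and `b` acting on a deep feature `h`, and it takes a single deterministic step of size `epsilon` along the normalized loss gradient, whereas the abstract describes perturbation distributions and a surrogate loss optimized by meta-learning. All function names are illustrative.

```python
import numpy as np


def softmax_cross_entropy(W, b, h, label):
    """Cross-entropy loss of a linear softmax head on one feature vector."""
    logits = W @ h + b
    z = logits - logits.max()  # shift for numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]


def feature_gradient(W, b, h, label):
    """Gradient of the cross-entropy loss with respect to the feature h."""
    logits = W @ h + b
    z = logits - logits.max()
    probs = np.exp(z) / np.exp(z).sum()
    probs[label] -= 1.0  # softmax probabilities minus one-hot target
    return W.T @ probs


def perturb_feature(W, b, h, label, epsilon, adversarial):
    """Shift h by epsilon along (adversarial) or against (anti-adversarial)
    the normalized loss gradient. Assumption: one fixed-size step, used
    only for illustration."""
    g = feature_gradient(W, b, h, label)
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return h.copy()
    sign = 1.0 if adversarial else -1.0
    return h + sign * epsilon * g / norm


if __name__ == "__main__":
    # Toy setup (assumed values): 3 classes, 3-dimensional features.
    W, b = np.eye(3), np.zeros(3)
    h, label = np.array([0.5, 0.2, -0.1]), 1
    h_adv = perturb_feature(W, b, h, label, epsilon=0.1, adversarial=True)
    h_anti = perturb_feature(W, b, h, label, epsilon=0.1, adversarial=False)
    for name, feat in [("anti-adversarial", h_anti), ("clean", h), ("adversarial", h_adv)]:
        print(name, softmax_cross_entropy(W, b, feat, label))
```

In this toy setting the adversarial copy raises the loss of the sample (a harder example) and the anti-adversarial copy lowers it (an easier example), which is the sense in which such perturbations can adjust learning difficulty per sample.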