Modern neural networks are over-parameterized and thus rely on strong regularization, such as data augmentation and weight decay, to reduce overfitting and improve generalization. The dominant form of data augmentation applies invariant transforms, where the learning target of a sample is invariant to the transform applied to that sample. Drawing inspiration from human visual classification studies, we propose generalizing augmentation with invariant transforms to soft augmentation, where the learning target softens non-linearly as a function of the degree of the transform applied to the sample: e.g., more aggressive image crop augmentations produce less confident learning targets. We demonstrate that soft targets allow for more aggressive data augmentation, offer more robust performance boosts, work with other augmentation policies, and, interestingly, produce better-calibrated models (since they are trained to be less confident on aggressively cropped/occluded examples). Combined with existing aggressive augmentation strategies, soft targets 1) double the top-1 accuracy boost across CIFAR-10, CIFAR-100, ImageNet-1K, and ImageNet-V2, 2) improve model occlusion performance by up to $4\times$, and 3) halve the expected calibration error (ECE). Finally, we show that soft augmentation generalizes to self-supervised classification tasks. Code is available at https://github.com/youngleox/soft_augmentation
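To make the idea of a target that softens non-linearly with the degree of the transform concrete, the following is a minimal sketch and not the authors' implementation (see the linked repository for that). It assumes the degree of a crop can be summarized by a single `visibility` fraction in [0, 1], and it uses a hypothetical power-law curve with exponent `k` that keeps full confidence for an untouched image and falls to chance level when nothing is visible. The function names, the curve, and the default `k` are all illustrative assumptions.

```python
import numpy as np


def soft_target(label, num_classes, visibility, k=2.0):
    """Build a softened one-hot target for a cropped or occluded sample.

    label:       integer index of the true class.
    num_classes: number of classes (must be at least 2).
    visibility:  assumed summary of the transform, the fraction of the
                 image still visible after cropping, in [0, 1].
    k:           hypothetical exponent that sets how non-linearly the
                 confidence decays as visibility drops.
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    if not 0.0 <= visibility <= 1.0:
        raise ValueError("visibility must lie in [0, 1]")

    chance = 1.0 / num_classes
    # Assumed curve: confidence 1 at full visibility, chance level at zero.
    confidence = 1.0 - (1.0 - chance) * (1.0 - visibility) ** k

    # Spread the remaining probability mass evenly over the other classes.
    target = np.full(num_classes, (1.0 - confidence) / (num_classes - 1))
    target[label] = confidence
    return target


def soft_cross_entropy(logits, target):
    """Cross-entropy between a soft target and softmax(logits)."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max()  # subtract the max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return float(-(target * log_probs).sum())
```

With this curve, an uncropped sample keeps an ordinary one-hot target, a sample with half of the image visible (10 classes, `k=2`) gets a true-class confidence of 0.775, and a fully occluded sample gets a uniform target. A larger `k` keeps targets confident for mild crops and softens them only for aggressive ones.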