Mixed sample data augmentation (MSDA) is a widely used technique that has been found to improve performance on a variety of tasks. In this paper, however, we show that the effect of MSDA is class-dependent: some classes improve while others decline. To reduce this class dependency, we propose DropMix, which excludes a specified percentage of the data from the MSDA computation. Training on the resulting combination of MSDA and non-MSDA data not only improves the performance of classes that were degraded by MSDA but also increases overall average accuracy, as shown in experiments on two datasets (CIFAR-100 and ImageNet) with three MSDA methods (Mixup, CutMix, and PuzzleMix).
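The idea of excluding a percentage of data from the MSDA computation can be illustrated with a minimal sketch. It assumes Mixup as the MSDA method, a per-batch random choice of excluded samples, and one-hot labels; the abstract does not specify any of these details, so the function name `dropmix_mixup_batch` and the parameters `drop_rate` and `alpha` are hypothetical and this should not be read as the authors' implementation.

```python
import numpy as np


def dropmix_mixup_batch(x, y_onehot, drop_rate=0.5, alpha=1.0, rng=None):
    """Apply Mixup to a batch, leaving a fraction of samples unmixed.

    Hypothetical sketch: `drop_rate` is the assumed fraction of samples
    excluded from mixing; `alpha` is the Beta parameter used by Mixup.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x.shape[0]

    # Standard Mixup: one mixing coefficient and one random pairing per batch.
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(n)
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]

    # Assumption: excluded samples are drawn uniformly at random per batch.
    n_drop = int(round(drop_rate * n))
    drop_idx = rng.choice(n, size=n_drop, replace=False)
    excluded = np.zeros(n, dtype=bool)
    excluded[drop_idx] = True

    # Excluded samples keep their original input and label.
    mask_x = excluded.reshape((n,) + (1,) * (x.ndim - 1))
    x_out = np.where(mask_x, x, x_mix)
    y_out = np.where(excluded[:, None], y_onehot, y_mix)
    return x_out, y_out, excluded
```

With `drop_rate=0` this reduces to plain Mixup, and with `drop_rate=1` it reduces to training without MSDA, so the parameter interpolates between the two regimes the abstract contrasts.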