Image-mixing augmentations (e.g., Mixup and CutMix), which typically mix two images, have become de facto training techniques for image classification. Despite their success, the literature has not established how many images should be mixed: the only reported finding is that a naive K-image expansion degrades performance. This study derives a new K-image mixing augmentation based on the stick-breaking process under a Dirichlet prior distribution. Through extensive experiments and analyses, we demonstrate that our K-image expansion augmentation outperforms conventional two-image mixing augmentation methods in three respects: (1) classifiers that are more robust and generalize better; (2) a more desirable loss landscape shape; (3) better adversarial robustness. Moreover, we show that our probabilistic model can measure sample-wise uncertainty and improve the efficiency of network architecture search, achieving a 7-fold reduction in search time. Code will be available at https://github.com/yjyoo3312/DCutMix-PyTorch.git.
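To make the idea of K-image mixing under a Dirichlet prior concrete, the following is a minimal sketch, not the authors' implementation. It samples K mixing weights from a Dirichlet distribution using the standard stick-breaking construction and then blends K images and their one-hot labels with those weights. Several things here are assumptions made for illustration: the function names `stick_breaking_weights` and `mix_k`, the concentration values, the toy flattened images, and the use of Mixup-style pixel blending rather than the region-based (CutMix-style) mixing the paper's repository name suggests.

```python
import random

def stick_breaking_weights(alphas, rng=random):
    """Sample w ~ Dirichlet(alphas) via stick-breaking (hypothetical helper)."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for i in range(len(alphas) - 1):
        tail = sum(alphas[i + 1:])            # total concentration of later pieces
        v = rng.betavariate(alphas[i], tail)  # fraction of the remaining stick to break off
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)                 # last piece takes whatever is left
    return weights

def mix_k(images, labels, weights):
    """Convex combination of K flattened images and K one-hot labels (Mixup-style, assumed)."""
    n_pixels, n_classes = len(images[0]), len(labels[0])
    mixed_image = [sum(w * img[p] for w, img in zip(weights, images)) for p in range(n_pixels)]
    mixed_label = [sum(w * lab[c] for w, lab in zip(weights, labels)) for c in range(n_classes)]
    return mixed_image, mixed_label

# Toy usage with K = 3 two-pixel "images"; all values are illustrative.
rng = random.Random(0)
w = stick_breaking_weights([1.0, 1.0, 1.0], rng)
images = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
x, y = mix_k(images, labels, w)
```

Because the weights are non-negative and sum to one, the mixed label `y` is itself a valid probability vector, which is what allows it to serve as a soft training target.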