Data augmentation is a regularization strategy for training deep learning models: it improves generalization and mitigates overfitting, which in turn improves performance. Although researchers have proposed many data augmentation techniques, these techniques rarely account for the difficulty of the augmented data. A recent line of research in natural language processing instead combines data augmentation with curriculum learning. In this study, we apply curriculum data augmentation to images and propose colorful cutout, which gradually increases the noise and difficulty introduced into the augmented image. Our experimental results suggest that curriculum data augmentation is viable for image data. We have publicly released our source code to support the reproducibility of our study.
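The abstract does not spell out how colorful cutout raises difficulty, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that a square cutout box is split into a grid of cells, that each cell is filled with a random color, and that the grid becomes finer as training progresses. The names `curriculum_level` and `colorful_cutout`, the linear schedule, and the parameters `max_level` and `box_size` are all hypothetical.

```python
import numpy as np


def curriculum_level(epoch, total_epochs, max_level=4):
    """Map training progress to a difficulty level in 1..max_level.

    Assumption: a simple linear schedule; the source does not specify one.
    """
    progress = epoch / max(total_epochs - 1, 1)
    return 1 + min(int(progress * max_level), max_level - 1)


def colorful_cutout(image, level, box_size=16, rng=None):
    """Return a copy of an HxWx3 uint8 image in which one box_size square
    is replaced by a level x level grid of randomly colored cells.

    Assumption: a higher level means more cells, hence more color noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, _ = image.shape
    out = image.copy()

    # Pick the top-left corner so the box lies fully inside the image.
    top = int(rng.integers(0, h - box_size + 1))
    left = int(rng.integers(0, w - box_size + 1))

    # Cell boundaries inside the box; together the cells cover it exactly.
    edges = np.linspace(0, box_size, level + 1).astype(int)
    for i in range(level):
        for j in range(level):
            color = rng.integers(0, 256, size=3, dtype=np.uint8)
            out[top + edges[i]:top + edges[i + 1],
                left + edges[j]:left + edges[j + 1]] = color
    return out
```

In a training loop, one would compute `level = curriculum_level(epoch, total_epochs)` once per epoch and pass it to `colorful_cutout` for each image, so early epochs see a single solid-colored box and later epochs see a finer, noisier patchwork.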