Data augmentation has long been a cornerstone of reducing overfitting in vision models, with methods like AutoAugment automating the design of task-specific augmentations. Recent advances in generative models, such as conditional diffusion and few-shot NeRFs, offer a new paradigm for data augmentation by synthesizing data with significantly greater diversity and realism. However, unlike traditional augmentations such as cropping or rotation, these methods introduce substantial changes that can enhance robustness but also risk degrading performance when the augmentations are poorly matched to the task. In this work, we present EvoAug, an automated augmentation-learning pipeline that leverages these generative models alongside an efficient evolutionary algorithm to learn optimal task-specific augmentations. Our pipeline introduces a novel approach to image augmentation that learns stochastic augmentation trees, which hierarchically compose individual augmentations into more structured and adaptive transformations. We demonstrate strong performance across fine-grained classification and few-shot learning tasks. Notably, our pipeline discovers augmentations that align with domain knowledge, even in low-data settings. These results highlight the potential of learned generative augmentations, unlocking new possibilities for robust model training.
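To make the idea of a stochastic augmentation tree concrete, here is a minimal sketch in Python. The node structure, the per-node application probability, and the toy transforms are illustrative assumptions, not the paper's actual EvoAug implementation; an evolutionary search would mutate the tree's structure and probabilities, which is omitted here.

```python
import random

class AugNode:
    """One node of a stochastic augmentation tree.

    A node applies its own transform with probability p, then passes the
    result through its children in order, so subtrees hierarchically
    compose augmentations. (Illustrative sketch, not the paper's code.)
    """
    def __init__(self, name, fn, p=1.0, children=None):
        self.name = name
        self.fn = fn              # transform: sample -> sample
        self.p = p                # probability of applying this transform
        self.children = children or []

    def __call__(self, sample, rng):
        if rng.random() < self.p:
            sample = self.fn(sample)
        for child in self.children:
            sample = child(sample, rng)
        return sample

# Toy "images" are lists of numbers; these stand in for real transforms
# such as crops, rotations, or generative (diffusion/NeRF-based) edits.
def flip(img):
    return img[::-1]

def brighten(img):
    return [x + 1 for x in img]

# A small tree: an identity root with two stochastic leaf augmentations.
tree = AugNode("root", lambda x: x, p=1.0, children=[
    AugNode("flip", flip, p=0.5),
    AugNode("brighten", brighten, p=0.8),
])

rng = random.Random(0)
augmented = tree([1, 2, 3], rng)
```

An evolutionary algorithm over such trees would treat each tree as a genome: mutations could perturb a node's `p`, swap a transform, or attach/detach subtrees, with fitness given by validation performance of a model trained under that tree.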