Recently, the growing capabilities of deep generative models have underscored their potential for improving image classification accuracy. However, existing methods often require generating a disproportionately large number of images relative to the original dataset while yielding only marginal accuracy gains. This computationally expensive and time-consuming process limits the practicality of such approaches. In this paper, we address the efficiency of image generation by focusing on the specific needs and characteristics of the model being trained. Built on the central tenet of active learning, our method, named ActGen, takes a training-aware approach to image generation. It aims to create images akin to the challenging or misclassified samples encountered by the current model and adds these generated images to the training set to improve model performance. ActGen introduces an attentive image guidance technique that uses real images as guides during the denoising process of a diffusion model. The model's attention on the class prompt is leveraged to preserve a similar foreground object while diversifying the background. Furthermore, we introduce a gradient-based generation guidance method, which employs two losses to generate more challenging samples and to prevent the generated images from being too similar to previously generated ones. Experimental results on the CIFAR and ImageNet datasets demonstrate that our method achieves better performance with a significantly reduced number of generated images.
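The training-aware selection step can be illustrated with a minimal sketch. The abstract does not specify how "challenging" samples are identified, so the criterion below (lowest predicted probability on the true class) is an assumption, and the function name `select_hard_samples` and its parameters are hypothetical rather than taken from ActGen.

```python
def select_hard_samples(probs, labels, k):
    """Return indices of the k real images the current model finds hardest.

    probs:  list of per-sample class-probability lists, shape (N, C).
    labels: list of N integer ground-truth labels.

    Assumed hardness criterion (not from the paper): a low predicted
    probability on the true class, so misclassified samples tend to
    rank first.
    """
    true_class_prob = [row[y] for row, y in zip(probs, labels)]
    # sorted() is stable, so ties keep their dataset order.
    order = sorted(range(len(labels)), key=lambda i: true_class_prob[i])
    return order[:k]


# Toy predictions from a hypothetical two-class classifier.
probs = [
    [0.90, 0.10],  # true class 0: confident and correct
    [0.40, 0.60],  # true class 0: misclassified
    [0.55, 0.45],  # true class 0: correct but uncertain
    [0.20, 0.80],  # true class 1: confident and correct
]
labels = [0, 0, 0, 1]

# Indices of the real images that would serve as guides for generation.
guide_indices = select_hard_samples(probs, labels, k=2)
print(guide_indices)
```

In a pipeline of this kind, the selected indices would pick the real images used as guides during denoising, so that generation effort goes to the cases the classifier currently handles worst instead of being spread uniformly over the dataset.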