Acquiring high-quality data for training discriminative models is a crucial yet challenging aspect of building effective predictive systems. In this paper, we present Diffusion Inversion, a simple yet effective method that leverages a pre-trained generative model, Stable Diffusion, to generate diverse, high-quality training data for image classification. Our approach captures the original data distribution and ensures data coverage by inverting images to the latent space of Stable Diffusion, and it generates diverse novel training images by conditioning the generative model on noisy versions of the resulting latent vectors. We identify three key components that allow our generated images to successfully supplant the original dataset, leading to a 2-3x improvement in sample complexity and a 6.5x reduction in sampling time. Moreover, our approach consistently outperforms generic prompt-based steering methods and a KNN retrieval baseline across a wide range of datasets. Additionally, we demonstrate that our approach is compatible with widely used data augmentation techniques, and that the generated data reliably supports various neural architectures and enhances few-shot learning.
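The idea of conditioning on noisy versions of the inverted vectors can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `noisy_conditions`, the isotropic Gaussian noise, the `noise_scale` parameter, and the 4-dimensional stand-in vector are all assumptions made for illustration, since the abstract does not specify the noise distribution or the vector dimensionality. Inverting an image and running Stable Diffusion are outside the scope of this sketch.

```python
import random


def noisy_conditions(vector, num_variants, noise_scale, seed=0):
    """Return `num_variants` perturbed copies of `vector`.

    Assumption: each coordinate receives independent Gaussian noise with
    standard deviation `noise_scale`. The abstract only says "noisy
    versions", so the distribution here is illustrative.
    """
    rng = random.Random(seed)  # seeded for reproducible variants
    return [
        [x + rng.gauss(0.0, noise_scale) for x in vector]
        for _ in range(num_variants)
    ]


# Hypothetical stand-in for a vector obtained by inverting one training image.
inverted = [0.5, -1.0, 0.25, 2.0]

# Each variant would serve as a separate conditioning input to the generator.
variants = noisy_conditions(inverted, num_variants=3, noise_scale=0.1)
```

Under these assumptions, a larger `noise_scale` would trade fidelity to the source image for more diversity among the generated samples, and a scale of zero reproduces the inverted vector unchanged.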