Generating high-quality labeled image datasets is crucial for training accurate and robust machine learning models in computer vision. However, manually labeling real images is often time-consuming and costly. To address these challenges, we introduce "DiffuGen," a simple and adaptable approach that uses stable diffusion models to create labeled image datasets efficiently. Because it builds on stable diffusion models, our approach ensures the quality of the generated datasets and provides a versatile solution for label generation. In this paper, we present the methodology behind DiffuGen, which combines diffusion models with two distinct labeling techniques: unsupervised and supervised. Notably, DiffuGen employs prompt templating for adaptable image generation and textual inversion to enhance the capabilities of the diffusion model.
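As a rough illustration of what prompt templating can look like in practice, the minimal sketch below expands one template into every combination of attribute values. The template string, the attribute names, and the helper `expand_prompts` are hypothetical and chosen only for illustration; they are not taken from DiffuGen itself.

```python
from itertools import product


def expand_prompts(template, options):
    """Fill every combination of option values into the template."""
    keys = list(options)
    return [
        template.format(**dict(zip(keys, values)))
        for values in product(*(options[k] for k in keys))
    ]


# Hypothetical template and attribute values (assumptions, not from the paper).
template = "a photo of a {color} {object_name} {viewpoint}"
options = {
    "color": ["red", "blue"],
    "object_name": ["car", "bus"],
    "viewpoint": ["from the front", "from the side"],
}

prompts = expand_prompts(template, options)
for prompt in prompts:
    print(prompt)
```

Each resulting string could then be passed to a text-to-image diffusion pipeline. Changing the lists of values changes the composition of the generated dataset without touching the generation code.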