We present ObjBlur, a novel curriculum learning approach for improving layout-to-image generation models, where the task is to produce realistic images from layouts composed of bounding boxes and class labels. Our method is based on progressive object-level blurring, which stabilizes training and improves the quality of generated images. The curriculum applies varying degrees of blurring to individual objects or to the background during training, starting with strong blurring and progressing to increasingly clean images. We find that this approach yields significant performance improvements, more stable training, smoother convergence, and reduced variance across multiple runs. Moreover, the technique is compatible with both generative adversarial networks and diffusion models, which underlines its applicability across generative modeling paradigms. With ObjBlur, we reach new state-of-the-art results on the complex COCO and Visual Genome datasets.
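To make the idea of progressive object-level blurring concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear decay of the blur strength over training, a Gaussian blur, and a grayscale image stored as a list of rows; the names `blur_sigma`, `gaussian_blur`, `blur_objects`, and the default `sigma_max` are hypothetical. The abstract does not specify the actual schedule, the blur type, or where in the training loop the blur is applied.

```python
import math


def blur_sigma(step, total_steps, sigma_max=4.0):
    """Assumed schedule: decay linearly from sigma_max (step 0) to 0 (end)."""
    if total_steps <= 0:
        return 0.0
    progress = min(max(step / total_steps, 0.0), 1.0)
    return sigma_max * (1.0 - progress)


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel; the identity kernel when sigma <= 0."""
    if sigma <= 0:
        return [1.0]
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_blur(image, sigma):
    """Separable Gaussian blur of a 2-D image, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])

    def clamp(value, upper):
        return min(max(value, 0), upper)

    # Horizontal pass, then vertical pass over the horizontal result.
    horiz = [[sum(k * row[clamp(x + i - radius, width - 1)]
                  for i, k in enumerate(kernel))
              for x in range(width)]
             for row in image]
    return [[sum(k * horiz[clamp(y + i - radius, height - 1)][x]
                 for i, k in enumerate(kernel))
             for x in range(width)]
            for y in range(height)]


def blur_objects(image, boxes, sigma):
    """Blur only the pixels inside each box (x0, y0, x1, y1), upper bounds exclusive."""
    blurred = gaussian_blur(image, sigma)
    out = [row[:] for row in image]
    for x0, y0, x1, y1 in boxes:
        for y in range(y0, y1):
            out[y][x0:x1] = blurred[y][x0:x1]
    return out


# Usage: one bright pixel inside a single object box, blurred early in training.
image = [[0.0] * 8 for _ in range(8)]
image[4][4] = 1.0
early = blur_objects(image, [(2, 2, 7, 7)], blur_sigma(step=75, total_steps=100))
late = blur_objects(image, [(2, 2, 7, 7)], blur_sigma(step=100, total_steps=100))
```

In this sketch, `early` has the bright pixel smeared across the box while pixels outside the box are left untouched, and `late` equals the original image because the blur strength has decayed to zero. Blurring the background instead would amount to compositing in the opposite direction: start from the blurred image and copy the sharp pixels back inside each box.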