This paper addresses the object repetition issue in patch-wise higher-resolution image generation. We propose AccDiffusion, an accurate method for patch-wise higher-resolution image generation without training. An in-depth analysis in this paper reveals that an identical text prompt for different patches causes repeated object generation, while omitting the prompt compromises image details. Therefore, our AccDiffusion, for the first time, proposes to decouple the vanilla image-content-aware prompt into a set of patch-content-aware prompts, each of which serves as a more precise description of an image patch. In addition, AccDiffusion introduces dilated sampling with window interaction to improve global consistency in higher-resolution image generation. Experimental comparisons with existing methods demonstrate that our AccDiffusion effectively addresses the issue of repeated object generation and achieves better performance in higher-resolution image generation. Our code is released at \url{https://github.com/lzhxmu/AccDiffusion}.