We introduce a general framework for generating diverse visual content, including ambiguous images, panorama images, mesh textures, and Gaussian splat textures, by synchronizing multiple diffusion processes. We present an exhaustive investigation of all possible scenarios for synchronizing multiple diffusion processes through a canonical space and analyze their characteristics across applications. In doing so, we reveal a previously unexplored case: averaging the outputs of Tweedie's formula while conducting denoising in multiple instance spaces. This case also provides the best quality with the widest applicability to downstream tasks. We name this case SyncTweedies. In our experiments generating the aforementioned visual content, we demonstrate that SyncTweedies achieves superior generation quality compared to other synchronization methods as well as optimization-based and iterative-update-based methods.
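To make the core idea concrete, the following is a minimal sketch of one synchronized denoising step, not the authors' implementation. It rests on several illustrative assumptions: the canonical space is a 1D array, each instance space is a contiguous window of it (so projection is slicing and unprojection is writing the window back), the per-view update is a deterministic DDIM step, and `eps_fn` is a hypothetical stand-in for a pretrained noise predictor. All function names are invented for this example.

```python
import numpy as np

def tweedie(x_t, eps, alpha_bar_t):
    """Tweedie's formula: estimate the clean sample x_0 from a noisy x_t."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps) / np.sqrt(alpha_bar_t)

def average_in_canonical(x0_views, starts, canon_len):
    """Unproject per-view x_0 estimates into the canonical space and
    average them wherever views overlap (assumed: views are 1D windows)."""
    total = np.zeros(canon_len)
    count = np.zeros(canon_len)
    for x0, s in zip(x0_views, starts):
        total[s:s + len(x0)] += x0
        count[s:s + len(x0)] += 1
    return total / np.maximum(count, 1)  # uncovered entries stay at zero

def sync_step(x_views, starts, canon_len, eps_fn, alpha_bar_t, alpha_bar_prev):
    """One synchronized step: denoise in each instance space, but replace
    each view's Tweedie output with the canonical-space average."""
    # 1. Predict noise and the clean-sample estimate in every instance space.
    x0_views = [tweedie(x, eps_fn(x), alpha_bar_t) for x in x_views]
    # 2. Synchronize: average the Tweedie outputs in the canonical space.
    canon = average_in_canonical(x0_views, starts, canon_len)
    # 3. Project back and take a deterministic DDIM step per view
    #    (assumed update rule for this sketch).
    next_views = []
    for x, s in zip(x_views, starts):
        x0_sync = canon[s:s + len(x)]
        eps_sync = (x - np.sqrt(alpha_bar_t) * x0_sync) / np.sqrt(1.0 - alpha_bar_t)
        next_views.append(np.sqrt(alpha_bar_prev) * x0_sync
                          + np.sqrt(1.0 - alpha_bar_prev) * eps_sync)
    return next_views, canon
```

In this sketch only the clean-sample estimates are shared between views; the noisy latents themselves stay in their own instance spaces. For mesh or Gaussian splat textures, the slicing and write-back operations would presumably be replaced by rendering and unprojection, which the sketch does not attempt to model.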