Diffusion models (DMs) have enabled breakthroughs in image synthesis tasks but lack an intuitive interface for consistent image-to-image (I2I) translation. Various methods have been explored to address this issue, including mask-based methods, attention-based methods, and image conditioning. However, enabling unpaired I2I translation with pre-trained DMs while maintaining satisfactory consistency remains a critical challenge. This paper introduces Cyclenet, a novel yet simple method that incorporates cycle consistency into DMs to regularize image manipulation. We validate Cyclenet on unpaired I2I tasks of different granularities. Besides scene-level and object-level translation, we contribute a multi-domain I2I translation dataset for studying the physical state changes of objects. Our empirical studies show that Cyclenet is superior in translation consistency and quality, and can generate high-quality images for out-of-domain distributions with a simple change of the textual prompt. Cyclenet is a practical framework that remains robust even with very limited training data (around 2k images) and requires minimal computational resources (1 GPU) to train. Project homepage: https://cyclenetweb.github.io/