Consistency Models - Zhuanzhi Papers

Diffusion models have made significant breakthroughs in image, audio, and video generation, but they depend on an iterative generation process that makes sampling slow and limits their potential for real-time applications. To overcome this limitation, we propose consistency models, a new family of generative models that achieve high sample quality without adversarial training. They support fast one-step generation by design, while still allowing few-step sampling to trade compute for sample quality. They also support zero-shot data editing, such as image inpainting, colorization, and super-resolution, without requiring explicit training on these tasks. Consistency models can be trained either by distilling pre-trained diffusion models or as standalone generative models. Through extensive experiments, we demonstrate that they outperform existing distillation techniques for diffusion models in one- and few-step generation. For example, we achieve a new state-of-the-art FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 for one-step generation. When trained as standalone generative models, consistency models also outperform single-step, non-adversarial generative models on standard benchmarks such as CIFAR-10, ImageNet 64x64, and LSUN 256x256.
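
To make the one-step versus few-step trade-off concrete, here is a minimal sketch, not the authors' implementation. It assumes noisy samples of the form x_t = x_0 + t * z and replaces the trained network with a closed-form toy function for 1-D Gaussian data. The names `consistency_fn`, `one_step_sample`, and `multistep_sample`, and the constants `EPS`, `T_MAX`, `MU`, and `SIGMA_DATA`, are illustrative assumptions.

```python
import math
import random

EPS = 0.002                # smallest noise level (assumed value)
T_MAX = 80.0               # largest noise level (assumed value)
MU, SIGMA_DATA = 3.0, 0.5  # toy 1-D Gaussian "data" distribution (assumed)


def consistency_fn(x, t):
    """Toy stand-in for a trained consistency model f(x, t).

    For Gaussian data perturbed as x_t = x_0 + t * z, the probability-flow
    ODE has a closed-form solution, so the point at time t can be mapped
    straight back to the start of its trajectory at t = EPS.
    """
    scale = math.sqrt(SIGMA_DATA**2 + EPS**2) / math.sqrt(SIGMA_DATA**2 + t**2)
    return MU + (x - MU) * scale


def one_step_sample(rng):
    """One-step generation: draw pure noise, apply f once."""
    x_T = rng.gauss(0.0, T_MAX)
    return consistency_fn(x_T, T_MAX)


def multistep_sample(rng, times):
    """Few-step generation: alternate re-noising and denoising.

    `times` is a decreasing list of noise levels in (EPS, T_MAX).
    Each extra step costs one more evaluation of f.
    """
    x = one_step_sample(rng)
    for t in times:
        z = rng.gauss(0.0, 1.0)
        x_t = x + math.sqrt(t**2 - EPS**2) * z  # re-noise to level t
        x = consistency_fn(x_t, t)              # map back to level EPS
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    print("one step: ", one_step_sample(rng))
    print("few steps:", multistep_sample(rng, [20.0, 5.0, 1.0]))
```

In this sketch the one-step sampler calls the function once, while each entry in `times` adds one more call. With a real trained network in place of the toy function, the extra calls are the compute being traded for sample quality.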
