Consistency models (CMs) are a powerful class of diffusion-based generative models optimized for fast sampling. Most existing CMs are trained using discretized timesteps, which introduce additional hyperparameters and are prone to discretization errors. While continuous-time formulations can mitigate these issues, their success has been limited by training instability. To address this, we propose a simplified theoretical framework that unifies previous parameterizations of diffusion models and CMs, identifying the root causes of instability. Based on this analysis, we introduce key improvements in diffusion process parameterization, network architecture, and training objectives. These changes enable us to train continuous-time CMs at an unprecedented scale, reaching 1.5B parameters on ImageNet 512x512. Our proposed training algorithm, using only two sampling steps, achieves FID scores of 2.06 on CIFAR-10, 1.48 on ImageNet 64x64, and 1.88 on ImageNet 512x512, narrowing the gap in FID scores with the best existing diffusion models to within 10%.
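The "two sampling steps" mentioned above refer to the standard multistep consistency sampling procedure: map pure noise directly to a clean estimate, re-noise to an intermediate level, and denoise once more. A minimal sketch of that procedure follows; the function `f`, the noise levels `sigma_max` and `t_mid`, and the toy consistency function are all illustrative placeholders, not the paper's actual model or settings.

```python
import numpy as np

def two_step_consistency_sample(f, shape, sigma_max=80.0, t_mid=1.1, rng=None):
    """Two-step consistency-model sampling (illustrative sketch only).

    f(x, t) maps a noisy sample at noise level t directly to a clean
    sample estimate -- the defining property of a consistency model.
    sigma_max and t_mid are placeholder noise levels, not tuned values.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    # Step 1: map pure noise at the maximum noise level straight to data.
    x = sigma_max * rng.standard_normal(shape)
    x0 = f(x, sigma_max)
    # Step 2: re-noise the estimate to an intermediate level, denoise again.
    x_mid = x0 + t_mid * rng.standard_normal(shape)
    return f(x_mid, t_mid)

# Toy consistency function: pretends the data distribution is N(0, I),
# so the posterior mean of x0 given x_t = x0 + t*eps is x_t / (1 + t**2).
toy_f = lambda x, t: x / (1.0 + t ** 2)
sample = two_step_consistency_sample(toy_f, (4,))
```

The second step is what distinguishes multistep consistency sampling from one-step generation: re-noising and denoising once more trades a little extra compute for noticeably better sample quality.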