Denoising generative models, such as diffusion and flow-based models, produce high-quality samples but require many denoising steps due to discretization error. Flow maps, which estimate the average velocity between timesteps, mitigate this error and enable faster sampling. However, their training typically demands architectural changes that limit compatibility with pretrained flow models. We introduce Decoupled MeanFlow, a simple decoding strategy that converts flow models into flow map models without architectural modifications. Our method conditions the final blocks of diffusion transformers on the subsequent timestep, allowing pretrained flow models to be directly repurposed as flow maps. Combined with enhanced training techniques, this design enables high-quality generation in as few as 1 to 4 steps. Notably, we find that training flow models and subsequently converting them is more efficient and effective than training flow maps from scratch. On ImageNet 256x256 and 512x512, our models attain 1-step FID of 2.16 and 2.12, respectively, surpassing prior art by a large margin. Furthermore, we achieve FID of 1.51 and 1.68 when increasing the steps to 4, which nearly matches the performance of flow models while delivering over 100x faster inference.
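For reference, the flow-map quantity described above can be written down explicitly. Under a MeanFlow-style formulation (the exact sign and time conventions here are an assumption, not taken from the abstract), the average velocity between a current timestep $t$ and a target timestep $r < t$ is

\[
u(x_t, r, t) \;=\; \frac{1}{t - r} \int_{r}^{t} v(x_\tau, \tau)\, d\tau ,
\]

where $v$ is the instantaneous velocity field of the underlying flow model. A single sampling step then jumps directly from $t$ to $r$ via

\[
x_r \;=\; x_t - (t - r)\, u(x_t, r, t),
\]

which recovers the ordinary flow model's velocity in the limit $r \to t$.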
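A minimal sketch of the decoding strategy is given below, assuming a simplified DiT-style backbone: the encoder blocks are conditioned only on the current timestep t (as in the pretrained flow model), while the final blocks are additionally conditioned on the subsequent timestep r, so the network can be read out as an average velocity u(x_t, t, r). All module names, the block layout, and the sampling time convention are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of decoupled decoding for a flow-map model.
# Encoder blocks see only t (matching a pretrained flow model); the final
# "decoder" blocks are additionally conditioned on the target timestep r.

import torch
import torch.nn as nn


class Block(nn.Module):
    """One transformer block modulated by a conditioning embedding (AdaLN-style)."""

    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.mod = nn.Linear(dim, 2 * dim)  # scale/shift from the conditioning vector

    def forward(self, x, cond):
        scale, shift = self.mod(cond).unsqueeze(1).chunk(2, dim=-1)
        h = self.norm1(x) * (1 + scale) + shift
        x = x + self.attn(h, h, h, need_weights=False)[0]
        x = x + self.mlp(self.norm2(x))
        return x


def timestep_embed(t, dim):
    """Sinusoidal embedding of a scalar timestep in [0, 1]."""
    half = dim // 2
    freqs = torch.exp(-torch.linspace(0, 10, half, device=t.device))
    args = t[:, None] * freqs[None, :] * 1000.0
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)


class DecoupledFlowMap(nn.Module):
    """Encoder blocks condition on t only; the last `num_decoder` blocks also see r."""

    def __init__(self, dim: int = 256, depth: int = 12, num_decoder: int = 4):
        super().__init__()
        self.encoder = nn.ModuleList(Block(dim) for _ in range(depth - num_decoder))
        self.decoder = nn.ModuleList(Block(dim) for _ in range(num_decoder))
        self.t_embed = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.r_embed = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.out = nn.Linear(dim, dim)  # a real DiT would unpatchify here
        self.dim = dim

    def forward(self, x_tokens, t, r):
        # x_tokens: (B, N, dim) patch tokens of the noisy input x_t.
        c_t = self.t_embed(timestep_embed(t, self.dim))
        c_r = self.r_embed(timestep_embed(r, self.dim))
        h = x_tokens
        for blk in self.encoder:      # pretrained flow-model blocks: t conditioning only
            h = blk(h, c_t)
        for blk in self.decoder:      # decoding blocks: additionally conditioned on r
            h = blk(h, c_t + c_r)
        return self.out(h)            # predicted average velocity u(x_t, t, r)


@torch.no_grad()
def sample(model, x, steps=4):
    # Few-step sampling sketch; the t=1 (noise) -> t=0 (data) convention is an assumption.
    ts = torch.linspace(1.0, 0.0, steps + 1, device=x.device)
    for i in range(steps):
        t = ts[i].expand(x.shape[0])
        r = ts[i + 1].expand(x.shape[0])
        u = model(x, t, r)
        x = x - (t - r).view(-1, 1, 1) * u  # jump from t to r using the average velocity
    return x
```

Because only the conditioning of the final blocks changes in this sketch, the encoder weights of a pretrained flow model could be loaded unchanged, which is what makes such a conversion compatible with existing checkpoints.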