Classifier-free guidance is a key component for improving the performance of conditional generative models across many downstream tasks. It drastically improves sample quality, but has so far been used only for diffusion models. Flow Matching (FM), an alternative simulation-free approach, trains Continuous Normalizing Flows (CNFs) by regressing vector fields. It remains an open question whether classifier-free guidance can be performed for Flow Matching models, and to what extent it improves performance. In this paper, we explore the use of Guided Flows for a variety of downstream applications involving conditional image generation, speech synthesis, and reinforcement learning. In particular, we are the first to apply flow models to the offline reinforcement learning setting. We also show that Guided Flows significantly improve sample quality in image generation and zero-shot text-to-speech synthesis, and that in the reinforcement learning setting they can use drastically less computation without affecting the agent's overall performance.
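
As a rough illustration of what guidance could look like for a flow model, the sketch below carries the usual classifier-free guidance combination over to vector fields: the guided velocity is (1 - w) times the unconditional velocity plus w times the conditional one, and samples are drawn by integrating that velocity from t = 0 to t = 1 with Euler steps. The abstract does not state the formula, so this weighting convention is an assumption, and `guided_velocity`, `sample`, `toy_field`, and the `cond=None` convention for the unconditional branch are hypothetical names introduced only for this example.

```python
from typing import Callable, List, Optional

Vector = List[float]
# Assumed interface: field(t, x, cond) -> velocity at (t, x).
# cond=None stands for the unconditional (condition-dropped) prediction.
VectorField = Callable[[float, Vector, Optional[float]], Vector]


def guided_velocity(field: VectorField, t: float, x: Vector,
                    cond: float, w: float) -> Vector:
    """Blend conditional and unconditional velocities with guidance weight w.

    w = 0 gives the unconditional field, w = 1 the conditional field,
    and w > 1 extrapolates past the conditional field.
    """
    u_cond = field(t, x, cond)
    u_uncond = field(t, x, None)
    return [(1.0 - w) * uu + w * uc for uu, uc in zip(u_uncond, u_cond)]


def sample(field: VectorField, x0: Vector, cond: float,
           w: float, steps: int = 100) -> Vector:
    """Integrate dx/dt = guided velocity from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = guided_velocity(field, t, x, cond, w)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def toy_field(t: float, x: Vector, cond: Optional[float]) -> Vector:
    """Toy stand-in for a trained network (not a real model).

    Returns a constant velocity equal to the condition, or zero when the
    condition is dropped.
    """
    target = 0.0 if cond is None else cond
    return [target for _ in x]
```

In this toy setup a larger `w` pushes the sample further along the conditional direction. Fewer Euler `steps` means fewer evaluations of the vector field, which is the kind of compute knob the abstract alludes to; how far it can be reduced in practice is an empirical matter not shown here.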