Generative Fractional Diffusion Models

Gabriel Nobis, Maximilian Springenberg, Marco Aversa, Michael Detzel, Rembert Daems, Roderick Murray-Smith, Shinichi Nakajima, Sebastian Lapuschkin, Stefano Ermon, Tolga Birdal, Manfred Opper, Christoph Knochenhauer, Luis Oala, Wojciech Samek

We introduce the first continuous-time score-based generative model that leverages fractional diffusion processes for its underlying dynamics. Although diffusion models have excelled at capturing data distributions, they still suffer from limitations such as slow convergence, mode collapse on imbalanced data, and lack of diversity. These issues are partially linked to the use of light-tailed Brownian motion (BM) with independent increments. In this paper, we replace BM with an approximation of its non-Markovian counterpart, fractional Brownian motion (fBM), which is characterized by correlated increments and a Hurst index $H \in (0,1)$, where $H=1/2$ recovers classical BM. To ensure tractable inference and learning, we employ a recently popularized Markov approximation of fBM (MA-fBM) and derive its reverse-time model, resulting in generative fractional diffusion models (GFDMs). We characterize the forward dynamics using a continuous reparameterization trick and propose an augmented score matching loss to efficiently learn the score function, which is partly known in closed form, at minimal added cost. Driving the diffusion model with fBM provides flexibility and control: choosing $H \leq 1/2$ enters the regime of rough paths, whereas $H>1/2$ regularizes diffusion paths and invokes long-term memory as well as heavy-tailed behaviour (super-diffusion). The Markov approximation adds further control through the number of Markov processes that are linearly combined to approximate fBM. Our evaluations on real image datasets demonstrate that GFDM achieves greater pixel-wise diversity and enhanced image quality, as indicated by a lower FID, offering a promising alternative to traditional diffusion models.
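To make the role of the Hurst index concrete, the following is a minimal NumPy sketch of exact fBM, not of the MA-fBM approximation or the GFDM model described in the abstract. It relies only on the standard fBM covariance $\mathrm{Cov}(B_H(t), B_H(s)) = \tfrac{1}{2}\left(t^{2H} + s^{2H} - |t-s|^{2H}\right)$. The function names `fbm_covariance`, `sample_fbm`, and `increment_correlation` are hypothetical, and Cholesky sampling is chosen here for simplicity rather than taken from the paper.

```python
import numpy as np


def fbm_covariance(times, hurst):
    """Covariance matrix of fBM at strictly positive times.

    Uses Cov(B_H(t), B_H(s)) = 0.5 * (t^2H + s^2H - |t - s|^2H).
    """
    t = np.asarray(times, dtype=float)[:, None]
    s = t.T
    two_h = 2.0 * hurst
    return 0.5 * (t**two_h + s**two_h - np.abs(t - s) ** two_h)


def sample_fbm(times, hurst, n_paths, rng):
    """Draw exact fBM paths via a Cholesky factor of the covariance.

    Illustrative only: this costs O(n^3) in the number of time points
    and is not the Markov approximation used in the paper.
    """
    cov = fbm_covariance(times, hurst)
    # Small diagonal jitter (an assumption) for numerical stability.
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(len(times)))
    return rng.standard_normal((n_paths, len(times))) @ chol.T


def increment_correlation(hurst):
    """Correlation of two adjacent unit-length fBM increments.

    Follows from the covariance above: 2^(2H - 1) - 1.
    """
    return 2.0 ** (2.0 * hurst - 1.0) - 1.0
```

Under this formula, `increment_correlation(0.5)` is zero (independent increments, classical BM), values of $H$ above $1/2$ give positively correlated increments, and values below $1/2$ give negatively correlated ones, which matches the rough versus smooth regimes the abstract describes.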

