We introduce the first continuous-time score-based generative model that leverages fractional diffusion processes for its underlying dynamics. Although diffusion models excel at capturing data distributions, they still suffer from limitations such as slow convergence, mode collapse on imbalanced data, and lack of diversity. These issues are partially linked to the use of light-tailed Brownian motion (BM) with independent increments. In this paper, we replace BM with an approximation of its non-Markovian counterpart, fractional Brownian motion (fBM), which is characterized by correlated increments and a Hurst index $H \in (0,1)$, where $H=1/2$ recovers classical BM. To ensure tractable inference and learning, we employ a recently popularized Markov approximation of fBM (MA-fBM) and derive its reverse-time model, resulting in generative fractional diffusion models (GFDMs). We characterize the forward dynamics using a continuous reparameterization trick and propose an augmented score matching loss to efficiently learn the score function, which is partly known in closed form, at minimal added cost. Driving the diffusion model with fBM provides flexibility and control: choosing $H \leq 1/2$ enters the regime of rough paths, whereas $H>1/2$ regularizes diffusion paths and invokes long-term memory as well as heavy-tailed behaviour (super-diffusion). The Markov approximation offers further control through the number of Markov processes that are linearly combined to approximate fBM. Our evaluations on real image datasets demonstrate that GFDM achieves greater pixel-wise diversity and enhanced image quality, as indicated by a lower FID, offering a promising alternative to traditional diffusion models.
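To make the role of the Hurst index concrete, the following minimal sketch evaluates the autocovariance of unit-step fBM increments (fractional Gaussian noise), which follows from the standard fBM covariance $\tfrac{1}{2}\left(|t|^{2H} + |s|^{2H} - |t-s|^{2H}\right)$. It is an illustration only and not part of the GFDM method: the function names `fgn_autocovariance` and `increment_regime` are hypothetical, and unit variance and unit time steps are assumed.

```python
def fgn_autocovariance(hurst: float, lag: int) -> float:
    """Autocovariance of unit-step fBM increments at an integer lag.

    Assumes a normalized fBM (variance t^(2H)) sampled on a unit grid.
    """
    if not 0.0 < hurst < 1.0:
        raise ValueError("Hurst index must lie in (0, 1)")
    k = abs(lag)
    two_h = 2.0 * hurst
    # gamma(k) = 0.5 * (|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H})
    return 0.5 * ((k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)


def increment_regime(hurst: float) -> str:
    """Classify neighbouring increments by the sign of their lag-1 covariance."""
    cov = fgn_autocovariance(hurst, 1)
    if abs(cov) < 1e-12:
        return "independent"
    return "positively correlated" if cov > 0 else "negatively correlated"


if __name__ == "__main__":
    # Hypothetical example values of H, chosen only for illustration.
    for h in (0.25, 0.5, 0.75):
        print(f"H={h}: lag-1 covariance={fgn_autocovariance(h, 1):+.4f} "
              f"({increment_regime(h)})")
```

Under these assumptions, the lag-1 covariance equals $2^{2H-1} - 1$: it vanishes at $H=1/2$ (the independent increments of BM), is negative for $H<1/2$ (rough, anti-persistent paths), and is positive for $H>1/2$ (smoother paths with long-term memory).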