Diffusion models learn to reverse the progressive noising of a data distribution to create a generative model. However, the desired continuous nature of the noising process can be at odds with discrete data. To deal with this tension between continuous and discrete objects, we propose a method of performing diffusion on the probability simplex. Working on the probability simplex yields a natural interpretation in which points correspond to categorical probability distributions. Our method applies the softmax function to an Ornstein-Uhlenbeck process, a well-known stochastic differential equation. We find that our methodology also extends naturally to diffusion on the unit cube, which has applications for bounded image generation.
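To make the forward noising step concrete, the following is a minimal sketch, not the authors' implementation. It assumes an Ornstein-Uhlenbeck process of the form dY = -theta * Y dt + sigma dW run independently in each coordinate, with illustrative values for theta and sigma, and it uses the plain softmax over all coordinates to map the state onto the simplex. The elementwise sigmoid used for the unit cube is likewise an assumption, since the abstract does not name the map. The function names ou_sample, softmax, and sigmoid are hypothetical.

```python
import math
import random


def ou_sample(y0, t, theta=1.0, sigma=math.sqrt(2.0), rng=random):
    """Draw Y_t given Y_0 = y0 for dY = -theta * Y dt + sigma dW.

    Uses the exact Gaussian transition of the OU process, applied
    independently to each coordinate (an assumption of this sketch).
    """
    decay = math.exp(-theta * t)
    std = math.sqrt(sigma ** 2 / (2.0 * theta) * (1.0 - decay ** 2))
    return [y * decay + std * rng.gauss(0.0, 1.0) for y in y0]


def softmax(y):
    """Map a real vector to a point in the interior of the probability simplex."""
    m = max(y)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in y]
    s = sum(e)
    return [v / s for v in e]


def sigmoid(v):
    """Map a real number into (0, 1); assumed map for the unit-cube case."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    ev = math.exp(v)
    return ev / (1.0 + ev)


if __name__ == "__main__":
    rng = random.Random(0)
    y0 = [4.0, 0.0, 0.0]  # hypothetical logits favoring the first category
    for t in (0.0, 0.5, 5.0):
        y_t = ou_sample(y0, t, rng=rng)
        x_t = softmax(y_t)  # noised categorical distribution
        c_t = [sigmoid(v) for v in y_t]  # noised point in the unit cube
        print(t, [round(p, 3) for p in x_t], [round(c, 3) for c in c_t])
```

At t = 0 the sample equals the starting logits. As t grows, the influence of y0 decays and the state approaches the stationary Gaussian of the process, so the softmax output no longer depends on the starting category.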