Diffusion models (DMs), which enable both image generation from noise and inversion from data, have inspired powerful unpaired image-to-image (I2I) translation algorithms. However, they often require a large number of neural function evaluations (NFEs), which limits their practical applicability. In this paper, we tackle this problem with Schrödinger Bridges (SBs), which are stochastic differential equations (SDEs) between distributions with minimal transport cost. We analyze the probability flow ordinary differential equation (ODE) formulation of SBs and observe that its vector field can be decomposed into a linear combination of a source predictor, a target predictor, and a noise predictor. Inspired by this observation, we propose Latent Schrödinger Bridges (LSBs), which approximate the SB ODE via pre-trained Stable Diffusion, and we develop an appropriate prompt optimization scheme and change-of-variables formula to match training and inference between distributions. We demonstrate that our algorithm successfully performs competitive I2I translation in an unsupervised setting at only a fraction of the computational cost required by previous DM-based I2I methods.
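To make the decomposition concrete, the following is a minimal toy sketch of integrating an ODE whose vector field is a linear combination of three predictors. It is not the authors' implementation: the function name `integrate_decomposed_ode`, the coefficient schedules `a`, `b`, `c`, the time direction (source at t = 0, target at t = 1), and the constant toy predictors are all assumptions made for illustration. In the actual method the predictors would come from pre-trained Stable Diffusion, and the coefficients would follow from the SB probability flow ODE.

```python
import numpy as np

def integrate_decomposed_ode(x, src, tgt, eps, a, b, c, num_steps):
    """Euler-integrate dx/dt = a(t)*src(x,t) + b(t)*tgt(x,t) + c(t)*eps(x,t).

    All arguments are hypothetical stand-ins: src/tgt/eps play the role of
    the source, target, and noise predictors, and a/b/c are placeholder
    coefficient schedules. num_steps is the number of vector-field
    evaluations, the toy analogue of a small NFE budget.
    """
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        # Vector field as a linear combination of the three predictors.
        v = a(t) * src(x, t) + b(t) * tgt(x, t) + c(t) * eps(x, t)
        x = x + dt * v  # explicit Euler step
    return x

# Toy setup (assumed): oracle predictors that return fixed endpoints.
x0 = np.array([0.0, 1.0, 2.0])   # "source" sample
x1 = np.array([3.0, -1.0, 0.5])  # "target" sample

x_out = integrate_decomposed_ode(
    x0,
    src=lambda x, t: x0,                 # source predictor (toy oracle)
    tgt=lambda x, t: x1,                 # target predictor (toy oracle)
    eps=lambda x, t: np.zeros_like(x),   # noise predictor (toy: no noise)
    a=lambda t: -1.0,                    # placeholder coefficients giving
    b=lambda t: 1.0,                     # the straight-line field x1 - x0
    c=lambda t: 0.0,
    num_steps=4,
)
```

With these placeholder choices the vector field is the constant `x1 - x0`, so a handful of Euler steps carries the source sample to the target sample. This only illustrates why a well-structured ODE can be solved with few function evaluations; it says nothing about the accuracy of learned predictors.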