Boomerang: Local sampling on image manifolds using diffusion models

The inference stage of diffusion models can be seen as running a reverse-time diffusion stochastic differential equation, in which samples from a Gaussian latent distribution are transformed into samples from a target distribution that usually resides on a low-dimensional manifold, e.g., an image manifold. The intermediate values between the initial latent space and the image manifold can be interpreted as noisy images, with the amount of noise determined by the noise schedule of the forward diffusion process. We use this interpretation to present Boomerang, an approach for local sampling of image manifolds. As its name implies, Boomerang local sampling adds noise to an input image, moving it closer to the latent space, and then maps it back to the image manifold through a partial reverse diffusion process. Boomerang thus generates images on the manifold that are "similar," but nonidentical, to the original input image. The proximity of the generated images to the original is controlled by the amount of noise added. Furthermore, because the reverse diffusion process is stochastic, repeated runs on the same input yield distinct images, so local samples can be drawn from the manifold without duplicates. Boomerang works with any pretrained diffusion model, such as Stable Diffusion, without any adjustments to the reverse diffusion process. We present three applications of Boomerang. First, we provide a framework for constructing privacy-preserving datasets with controllable degrees of anonymity. Second, we show that using Boomerang for data augmentation increases generalization performance and outperforms state-of-the-art synthetic data augmentation. Lastly, we introduce a perceptual image enhancement framework that enables resolution enhancement.
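
The noise-then-partially-denoise procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a DDPM-style ancestral sampler with a linear noise schedule, and it replaces the pretrained network with a hypothetical `toy_eps_model`, the closed-form optimal noise predictor for data drawn from an isotropic Gaussian N(mu, s^2 I). The names `boomerang`, `toy_eps_model`, and `t_boom` are illustrative only; in practice the noise predictor would come from a pretrained diffusion model.

```python
import numpy as np

# Assumed linear DDPM noise schedule (an illustrative choice, not from the source).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)


def toy_eps_model(x_t, t, mu=0.0, s=0.5):
    """Hypothetical stand-in for a pretrained noise predictor.

    Returns E[eps | x_t] in closed form when the data are N(mu, s^2 I).
    """
    ab = alpha_bars[t - 1]
    return np.sqrt(1.0 - ab) * (x_t - np.sqrt(ab) * mu) / (ab * s**2 + 1.0 - ab)


def boomerang(x0, t_boom, eps_model, rng):
    """Noise x0 forward to step t_boom, then run the reverse process back to 0."""
    if t_boom == 0:
        return x0.copy()
    # Forward diffusion in one shot: x_t = sqrt(ab) * x0 + sqrt(1 - ab) * eps.
    ab = alpha_bars[t_boom - 1]
    x = np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * rng.standard_normal(x0.shape)
    # Partial reverse diffusion: start at t_boom instead of T.
    for t in range(t_boom, 0, -1):
        a, b, ab_t = alphas[t - 1], betas[t - 1], alpha_bars[t - 1]
        mean = (x - b / np.sqrt(1.0 - ab_t) * eps_model(x, t)) / np.sqrt(a)
        # No noise is injected on the final step.
        noise = rng.standard_normal(x.shape) if t > 1 else 0.0
        x = mean + np.sqrt(b) * noise
    return x


rng = np.random.default_rng(0)
x0 = 0.5 * rng.standard_normal(256)            # a "clean" sample from the toy data
near = boomerang(x0, 50, toy_eps_model, rng)   # little noise: stays close to x0
far = boomerang(x0, 800, toy_eps_model, rng)   # much noise: lands far from x0
print(np.linalg.norm(near - x0), np.linalg.norm(far - x0))
```

In this sketch `t_boom` plays the role of the amount of noise added: smaller values keep the output near the input, while larger values approach an unconditional sample from the toy distribution.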
