Existing customization methods require access to multiple reference examples to align pre-trained diffusion probabilistic models (DPMs) with user-provided concepts. This paper addresses the challenge of DPM customization when the only available supervision is a differentiable metric defined on the generated content. Since the sampling procedure of DPMs involves recursive calls to the denoising UNet, naïve gradient backpropagation requires storing the intermediate states of all iterations, resulting in extremely high memory consumption. To overcome this issue, we propose a novel method, AdjointDPM, which first generates new samples from diffusion models by solving the corresponding probability-flow ODEs. It then uses the adjoint sensitivity method to backpropagate the gradients of the loss to the models' parameters (including conditioning signals, network weights, and initial noise) by solving another augmented ODE. To reduce numerical errors in both the forward generation and gradient backpropagation processes, we further reparameterize the probability-flow ODE and the augmented ODE as simple non-stiff ODEs using exponential integration. Finally, we demonstrate the effectiveness of AdjointDPM on three tasks: converting visual effects into identification text embeddings, finetuning DPMs for specific types of stylization, and optimizing initial noise to generate adversarial samples for security auditing.
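To illustrate the adjoint sensitivity idea described in the abstract, the following is a minimal sketch on a toy problem, not the paper's implementation. It assumes linear dynamics f(x, theta) = -theta * x as a stand-in for the probability-flow ODE, a loss L = 0.5 * ||x(T)||^2 as a stand-in for the differentiable metric, and a fixed-step RK4 solver; the denoising UNet and the exponential-integration reparameterization are omitted. All function names (`rk4`, `forward`, `adjoint_grads`) are hypothetical. The point it shows is that the backward pass starts from the final state x(T) alone and recovers the gradients with respect to both the initial state and the parameter by solving one augmented ODE backward in time, without storing the forward trajectory.

```python
import numpy as np

def rk4(func, z0, t0, t1, steps):
    """Integrate dz/dt = func(z) from t0 to t1 with fixed-step RK4.
    If t1 < t0 the step size is negative, i.e. integration runs backward."""
    z = np.asarray(z0, dtype=float)
    h = (t1 - t0) / steps
    for _ in range(steps):
        k1 = func(z)
        k2 = func(z + 0.5 * h * k1)
        k3 = func(z + 0.5 * h * k2)
        k4 = func(z + h * k3)
        z = z + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return z

# Toy dynamics (an assumption for illustration): f(x, theta) = -theta * x
def f(x, theta):
    return -theta * x

def df_dx(x, theta):
    # The Jacobian of f w.r.t. x is diagonal; return its diagonal entry.
    return -theta

def df_dtheta(x, theta):
    return -x

def forward(x0, theta, T, steps=200):
    """Forward generation: solve the ODE from t=0 to t=T, keep only x(T)."""
    return rk4(lambda x: f(x, theta), x0, 0.0, T, steps)

def adjoint_grads(xT, theta, T, steps=200):
    """Solve the augmented ODE backward from t=T to t=0.

    Augmented state z = [x, a, g], where
      a(t) = dL/dx(t), with da/dt = -a * df/dx and a(T) = dL/dx(T),
      g(t) = integral from t to T of a . df/dtheta, so g(0) = dL/dtheta.
    Returns (dL/dx0, dL/dtheta) for L = 0.5 * ||x(T)||^2.
    """
    n = xT.size
    aT = xT.copy()  # dL/dx(T) for the quadratic toy loss
    z0 = np.concatenate([xT, aT, [0.0]])

    def aug(z):
        x, a = z[:n], z[n:2 * n]
        dx = f(x, theta)                       # re-traces x(t) backward
        da = -a * df_dx(x, theta)              # adjoint dynamics
        dg = -np.dot(a, df_dtheta(x, theta))   # accumulates dL/dtheta
        return np.concatenate([dx, da, [dg]])

    z = rk4(aug, z0, T, 0.0, steps)
    return z[n:2 * n], z[-1]

x0_demo = np.array([1.0, -2.0, 0.5])
xT_demo = forward(x0_demo, theta=0.7, T=1.0)
grad_x0_demo, grad_theta_demo = adjoint_grads(xT_demo, theta=0.7, T=1.0)
print("dL/dx0    =", grad_x0_demo)
print("dL/dtheta =", grad_theta_demo)
```

For this toy system the gradients have closed forms, dL/dx0 = x0 * exp(-2 * theta * T) and dL/dtheta = -T * ||x0||^2 * exp(-2 * theta * T), which makes the sketch easy to check. In the setting the abstract describes, the same augmented state would carry the gradients with respect to conditioning signals, network weights, and initial noise.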