Reverse sampling and score distillation have emerged in recent years as the main workhorses for image manipulation using latent diffusion models (LDMs). While reverse diffusion sampling often requires adjustments to the LDM architecture or feature engineering, score distillation offers a simple yet powerful model-agnostic approach, but it is often prone to mode collapse. To address these limitations and leverage the strengths of both approaches, we introduce a novel framework called {\em DreamSampler}, which seamlessly integrates these two distinct approaches through the lens of regularized latent optimization. Like score distillation, DreamSampler is a model-agnostic approach applicable to any LDM architecture, but it allows both distillation and reverse sampling with additional guidance for image editing and reconstruction. Through experiments on image editing, SVG reconstruction, and other tasks, we demonstrate the competitive performance of DreamSampler compared to existing approaches, while enabling new applications.