Concept-centric Personalization with Large-scale Diffusion Priors

Although large-scale diffusion models can generate diverse open-world content, they still struggle to match the photorealism and fidelity of concept-specific generators. In this work, we introduce concept-centric personalization: the task of customizing large-scale diffusion priors for specific concepts. Our goal is to generate high-quality concept-centric images while maintaining the versatile controllability inherent to open-world models, enabling applications such as concept-centric stylization and image translation. We identify catastrophic forgetting of guidance prediction from diffusion priors as the fundamental issue in this task, and we develop a guidance-decoupled personalization framework to address it. We propose Generalized Classifier-free Guidance (GCFG) as the theoretical foundation of this framework. GCFG extends Classifier-free Guidance (CFG) to accommodate an arbitrary number of guidances, sourced from a variety of conditions and models. With GCFG, we separate conditional guidance into two distinct components: concept guidance for fidelity and control guidance for controllability. This separation makes it feasible to train a specialized model for concept guidance while keeping both control guidance and unconditional guidance intact. We then present a null-text Concept-centric Diffusion Model as a concept-specific generator that learns concept guidance without the need for text annotations. Code will be available at https://github.com/PRIV-Creation/Concept-centric-Personalization.
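To make the idea of combining several guidances concrete, the following is a minimal sketch of how a GCFG-style combination might look. It is not the authors' implementation: the additive form, the function names (`cfg`, `generalized_cfg`), and the toy numbers are assumptions for illustration. Each guidance is assumed to be a difference between a target noise prediction and a reference prediction, scaled by its own weight and added to the unconditional prediction of the diffusion prior.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
# One guidance term (hypothetical layout): (target prediction, reference prediction, scale).
Guidance = Tuple[Vector, Vector, float]


def cfg(eps_uncond: Vector, eps_cond: Vector, scale: float) -> Vector:
    """Standard classifier-free guidance for a single condition."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def generalized_cfg(eps_uncond: Vector, guidances: Sequence[Guidance]) -> Vector:
    """Assumed generalization: add any number of scaled guidance differences.

    Each guidance may come from a different condition or a different model,
    as long as all predictions share the same shape.
    """
    out = list(eps_uncond)
    for eps_target, eps_reference, scale in guidances:
        for k in range(len(out)):
            out[k] += scale * (eps_target[k] - eps_reference[k])
    return out


# Toy predictions standing in for real model outputs (all values are made up).
eps_prior_uncond = [0.0, 1.0]    # diffusion prior, null text
eps_prior_text = [1.0, 3.0]      # diffusion prior, text prompt
eps_concept_null = [0.5, 1.5]    # concept-specific model, null text

# Control guidance from the prior, concept guidance from the concept model.
control = (eps_prior_text, eps_prior_uncond, 2.0)
concept = (eps_concept_null, eps_prior_uncond, 1.0)

combined = generalized_cfg(eps_prior_uncond, [control, concept])
print(combined)
```

Under these assumptions, passing a single guidance reduces to ordinary CFG, and the unconditional and control terms are computed only from the unmodified prior, which mirrors the stated aim of keeping them intact while a separate model supplies concept guidance.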
