X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model

We introduce X-Adapter, a universal upgrader that enables pretrained plug-and-play modules (e.g., ControlNet, LoRA) to work directly with an upgraded text-to-image diffusion model (e.g., SDXL) without further retraining. We achieve this by training an additional network that controls the frozen upgraded model, using new text-image data pairs. In detail, X-Adapter keeps a frozen copy of the old model to preserve the connectors of the different plugins. In addition, X-Adapter adds trainable mapping layers that bridge the decoders of the two model versions for feature remapping. The remapped features are used as guidance for the upgraded model. To strengthen the guidance ability of X-Adapter, we employ a null-text training strategy for the upgraded model. After training, we also introduce a two-stage denoising strategy to align the initial latents of X-Adapter and the upgraded model. Thanks to these strategies, X-Adapter demonstrates universal compatibility with various plugins and also enables plugins of different versions to work together, thereby expanding the functionality available to the diffusion community. To verify the effectiveness of the proposed method, we conduct extensive experiments, and the results show that X-Adapter may facilitate wider application of plugins in upgraded foundation diffusion models.
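The abstract gives no implementation details, so the following is only a minimal control-flow sketch of how the described pieces might fit together at inference time. Everything in it is an assumption made for illustration: the function names (`two_stage_denoise`, `base_step`, `upgraded_step`, `mapper`, `to_upgraded_latent`), the `switch_step` parameter, the choice to run only the frozen old model in the first stage, and the toy arithmetic that stands in for real networks. Latents and features are plain lists of floats rather than tensors.

```python
from typing import Callable, List, Tuple

Latent = List[float]
Features = List[float]


def two_stage_denoise(
    base_step: Callable[[Latent, int], Tuple[Latent, Features]],
    upgraded_step: Callable[[Latent, int, Features], Latent],
    mapper: Callable[[Features], Features],
    to_upgraded_latent: Callable[[Latent], Latent],
    init_latent: Latent,
    num_steps: int,
    switch_step: int,
) -> Tuple[Latent, List[str]]:
    """Hypothetical two-stage loop; not the authors' implementation."""
    if not 0 <= switch_step <= num_steps:
        raise ValueError("switch_step must lie in [0, num_steps]")

    trace: List[str] = []
    base_latent = list(init_latent)

    # Stage 1 (assumed): only the frozen old model, with its plugins, runs.
    for t in range(num_steps, switch_step, -1):
        base_latent, _ = base_step(base_latent, t)
        trace.append(f"base@{t}")

    # Alignment (assumed): derive the upgraded model's starting latent
    # from the old model's current latent.
    upgraded_latent = to_upgraded_latent(base_latent)

    # Stage 2 (assumed): both models run; decoder features of the old
    # model are remapped and passed to the upgraded model as guidance.
    for t in range(switch_step, 0, -1):
        base_latent, feats = base_step(base_latent, t)
        upgraded_latent = upgraded_step(upgraded_latent, t, mapper(feats))
        trace.append(f"both@{t}")

    return upgraded_latent, trace


# Toy stand-ins for the real networks (purely illustrative arithmetic).
def toy_base_step(z: Latent, t: int) -> Tuple[Latent, Features]:
    z_next = [0.5 * v for v in z]
    return z_next, z_next  # reuse the latent as the "decoder features"


def toy_mapper(feats: Features) -> Features:
    return [2.0 * f for f in feats]


def toy_upgraded_step(z: Latent, t: int, guidance: Features) -> Latent:
    return [0.5 * v + 0.1 * g for v, g in zip(z, guidance)]


def toy_to_upgraded_latent(z: Latent) -> Latent:
    return list(z)


if __name__ == "__main__":
    latent, trace = two_stage_denoise(
        toy_base_step, toy_upgraded_step, toy_mapper,
        toy_to_upgraded_latent, [1.0, -1.0], num_steps=4, switch_step=2,
    )
    print(trace)
    print(latent)
```

In this sketch the old model and the upgraded model are passed in as callables and never modified inside the loop, which mirrors the abstract's statement that both are frozen; only `mapper` stands in for the trainable part. The null-text training strategy is a training-time detail and is not represented here.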

