Custom diffusion models (CDMs) have attracted widespread attention due to their astonishing generative ability for personalized concepts. However, most existing CDMs unreasonably assume that personalized concepts are fixed and cannot change over time. Moreover, they heavily suffer from catastrophic forgetting and concept neglect on old personalized concepts when continually learning a series of new concepts. To address these challenges, we propose a novel Concept-Incremental text-to-image Diffusion Model (CIDM), which resolves catastrophic forgetting and concept neglect when learning new customization tasks in a concept-incremental manner. Specifically, to surmount the catastrophic forgetting of old concepts, we develop a concept consolidation loss and an elastic weight aggregation module, which explore task-specific and task-shared knowledge during training and aggregate the low-rank weights of all old concepts according to their contributions during inference. Moreover, to address concept neglect, we devise a context-controllable synthesis strategy that leverages expressive region features and noise estimation to control the contexts of generated images according to user conditions. Experiments validate that our CIDM surpasses existing custom diffusion models. The source codes are available at https://github.com/JiahuaDong/CIFC.