MuseumMaker: Continual Style Customization without Catastrophic Forgetting

Pre-trained large text-to-image (T2I) models, paired with an appropriate text prompt, have attracted growing interest in the field of customized image generation. However, catastrophic forgetting makes it hard to continually synthesize new user-provided styles while retaining satisfying results on previously learned styles. In this paper, we propose MuseumMaker, a method that synthesizes images following a set of customized styles in a never-ending manner and gradually accumulates these creative artistic works as a Museum. When facing a new customization style, we develop a style distillation loss module to transfer the style of the whole dataset into the generation of images. It minimizes the learning biases caused by image content and addresses the catastrophic overfitting induced by few-shot images. To deal with catastrophic forgetting among previously learned styles, we devise a dual regularization for a shared-LoRA module to optimize the direction of the model update, which regularizes the diffusion model from both the weight and the feature aspects. Meanwhile, a unique token embedding corresponding to the new style is learned by a task-wise token learning module, which preserves historical knowledge from past styles under the limitation of LoRA parameter quantity. As each new user-provided style arrives, MuseumMaker can capture the nuances of the new style while maintaining the details of learned styles. Experimental results on diverse style datasets validate the effectiveness of the proposed MuseumMaker method, showcasing its robustness and versatility across various scenarios.
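
The abstract does not state the exact form of the dual regularization, so the following is only a minimal sketch of the general idea: a task loss plus one penalty on how far the shared LoRA update moves in weight space and one penalty on how far the layer's output features move for the same input. Everything here is an assumption made for illustration, including the squared-L2 penalties, the single linear layer standing in for a diffusion model block, and the names `weight_penalty`, `feature_penalty`, `lam_w`, and `lam_f`; none of it is taken from the paper.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Apply a linear layer w to an input vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two matrices of the same shape."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def lora_delta(b: Matrix, a: Matrix) -> Matrix:
    """Low-rank update delta_W = B @ A, with B (d_out x r) and A (r x d_in)."""
    return matmul(b, a)


def weight_penalty(delta_new: Matrix, delta_old: Matrix) -> float:
    """Hypothetical weight-space term: squared distance between LoRA updates."""
    return sum((p - q) ** 2
               for rn, ro in zip(delta_new, delta_old)
               for p, q in zip(rn, ro))


def feature_penalty(w0: Matrix, delta_new: Matrix, delta_old: Matrix,
                    x: Vector) -> float:
    """Hypothetical feature-space term: squared distance between layer outputs."""
    y_new = matvec(add(w0, delta_new), x)
    y_old = matvec(add(w0, delta_old), x)
    return sum((p - q) ** 2 for p, q in zip(y_new, y_old))


def dual_regularized_loss(task_loss: float, w0: Matrix, delta_new: Matrix,
                          delta_old: Matrix, x: Vector,
                          lam_w: float = 0.1, lam_f: float = 0.1) -> float:
    """Task loss plus the two assumed penalties, weighted by lam_w and lam_f."""
    return (task_loss
            + lam_w * weight_penalty(delta_new, delta_old)
            + lam_f * feature_penalty(w0, delta_new, delta_old, x))


# Toy example: frozen 2x2 base weight and rank-1 LoRA factors.
w0 = [[1.0, 0.0], [0.0, 1.0]]
delta_old = lora_delta([[1.0], [0.0]], [[0.5, 0.0]])  # after earlier styles
delta_new = lora_delta([[1.0], [1.0]], [[0.5, 0.0]])  # candidate for new style
x = [2.0, 1.0]
total = dual_regularized_loss(1.0, w0, delta_new, delta_old, x)
```

In this toy setting the penalties vanish when the new update equals the old one, and they grow as the shared LoRA drifts away from what served the earlier styles. The style distillation loss and the task-wise token embedding described above are not modeled here.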

