Diffusion Soup: Model Merging for Text-to-Image Diffusion Models

We present Diffusion Soup, a compartmentalization method for text-to-image generation that averages the weights of diffusion models trained on sharded data. By construction, our approach enables training-free continual learning and unlearning with no additional memory or inference costs, since models corresponding to data shards can be added or removed by re-averaging. We show that Diffusion Soup samples from a point in weight space that approximates the geometric mean of the distributions of constituent datasets, which offers anti-memorization guarantees and enables zero-shot style mixing. Empirically, Diffusion Soup outperforms a paragon model trained on the union of all data shards and achieves a 30% improvement in Image Reward (IR; 0.34 $\to$ 0.44) on domain-sharded data, and a 59% improvement in IR (0.37 $\to$ 0.59) on aesthetic data. In both cases, souping also prevails in TIFA score (85.5 $\to$ 86.5 and 85.6 $\to$ 86.8, respectively). We demonstrate robust unlearning: removing any individual domain shard lowers performance by only 1% in IR (0.45 $\to$ 0.44). We also validate our theoretical insights on anti-memorization using real data. Finally, we showcase Diffusion Soup's ability to blend the distinct styles of models finetuned on different shards, resulting in the zero-shot generation of hybrid styles.
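The add-and-remove-by-re-averaging idea can be illustrated with a minimal sketch. It assumes uniform averaging over shard models that share the same parameter names and shapes; the `Soup` class, the `Weights` alias, and the use of plain Python lists in place of real diffusion-model tensors are all hypothetical choices made for illustration, not the authors' implementation.

```python
from typing import Dict, List

# Hypothetical stand-in for a model state dict: parameter name -> flat values.
Weights = Dict[str, List[float]]


class Soup:
    """Holds one set of weights per data shard and averages them on demand."""

    def __init__(self) -> None:
        self._shards: Dict[str, Weights] = {}

    def add(self, shard_id: str, weights: Weights) -> None:
        # "Continual learning": register a model trained on a new shard.
        self._shards[shard_id] = weights

    def remove(self, shard_id: str) -> None:
        # "Unlearning": drop the model trained on the shard to forget.
        del self._shards[shard_id]

    def average(self) -> Weights:
        # Uniform parameter-wise mean over all currently registered shards
        # (uniform weighting is an assumption of this sketch).
        if not self._shards:
            raise ValueError("soup is empty")
        models = list(self._shards.values())
        n = len(models)
        return {
            name: [sum(vals) / n for vals in zip(*(m[name] for m in models))]
            for name in models[0]
        }


soup = Soup()
soup.add("shard_a", {"w": [1.0, 2.0]})
soup.add("shard_b", {"w": [3.0, 4.0]})
merged = soup.average()        # parameter-wise mean of both shard models
soup.remove("shard_b")
after_removal = soup.average() # only shard_a contributes now
```

In this sketch the merged weights are a single model of the same size as each shard model, which matches the claim that souping adds no inference cost; the per-shard weights are kept only so the average can be recomputed when a shard is added or removed.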

