We introduce Compartmentalized Diffusion Models (CDMs), a method for training different diffusion models (or prompts) on distinct data sources and composing them arbitrarily at inference time. The individual models can be trained in isolation, at different times, and on different distributions and domains, and can later be composed to achieve performance comparable to that of a paragon model trained on all the data simultaneously. Furthermore, each model contains information only about the subset of the data it was exposed to during training, which enables several forms of training data protection. In particular, CDMs are the first method to enable both selective forgetting and continual learning for large-scale diffusion models, and they allow serving customized models based on a user's access rights. CDMs also make it possible to determine the importance of a subset of the data in generating particular samples.
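To make the idea of inference-time composition concrete, the following is a minimal sketch and not the method's actual composition rule, which the abstract does not specify. It assumes that each separately trained model returns a noise prediction for a noisy sample `x_t` at timestep `t`, and that the predictions are combined by a normalized weighted average. The function `compose_noise_prediction`, the `allowed` argument, the source names, and the constant toy models are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set

# Hypothetical type: a "model" maps (noisy sample, timestep) to a noise prediction.
NoiseModel = Callable[[List[float], int], List[float]]


def compose_noise_prediction(
    models: Dict[str, NoiseModel],
    weights: Dict[str, float],
    x_t: List[float],
    t: int,
    allowed: Optional[Set[str]] = None,
) -> List[float]:
    """Normalized weighted average of the predictions of the permitted models.

    ASSUMPTION: weighted averaging is an illustrative choice, not necessarily
    the composition rule used by CDMs.
    """
    # Keep only the models the caller is allowed to use (all of them if None).
    names = [n for n in models if allowed is None or n in allowed]
    if not names:
        raise ValueError("no model is available for this request")
    total = sum(weights[n] for n in names)
    if total <= 0:
        raise ValueError("weights of the selected models must sum to a positive value")

    combined = [0.0] * len(x_t)
    for n in names:
        pred = models[n](x_t, t)
        w = weights[n] / total  # renormalize over the selected models only
        for i, p in enumerate(pred):
            combined[i] += w * p
    return combined


# Toy stand-ins for two models trained in isolation on different data sources.
models: Dict[str, NoiseModel] = {
    "source_a": lambda x, t: [1.0 for _ in x],
    "source_b": lambda x, t: [3.0 for _ in x],
}
weights = {"source_a": 1.0, "source_b": 1.0}

# Composition over every source.
full = compose_noise_prediction(models, weights, [0.0, 0.0], t=10)

# Serving a user whose access rights cover only source_a.
restricted = compose_noise_prediction(
    models, weights, [0.0, 0.0], t=10, allowed={"source_a"}
)

# Selective forgetting in this sketch: drop the model trained on source_b.
remaining = {k: v for k, v in models.items() if k != "source_b"}
forgotten = compose_noise_prediction(remaining, weights, [0.0, 0.0], t=10)
```

Under these assumptions, restricting access and forgetting a source both reduce to choosing which models take part in the average, and continual learning amounts to adding a new entry to `models` without retraining the others.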