Few-shot fine-tuning of Diffusion Models (DMs) is a key advancement: it significantly reduces training costs and enables personalized AI applications. However, when we examine the training dynamics of DMs, we observe an unanticipated phenomenon: during training, image fidelity first improves, then unexpectedly deteriorates as noisy patterns emerge, and only recovers later, accompanied by severe overfitting. We term the stage in which noisy patterns appear in the generated images the corruption stage. To understand this corruption stage, we first theoretically model the one-shot fine-tuning scenario and then extend this modeling to more general cases. Through this modeling, we identify the primary cause of the corruption stage: a narrowed learning distribution inherent in few-shot fine-tuning. To tackle this, we apply Bayesian Neural Networks (BNNs) to DMs with variational inference to implicitly broaden the learned distribution, and show that the learning objective of the BNNs can naturally be regarded as an expectation of the diffusion loss together with a further regularization toward the pretrained DMs. This approach is highly compatible with existing few-shot fine-tuning methods for DMs and introduces no extra inference cost. Experimental results demonstrate that our method significantly mitigates corruption and improves the fidelity, quality, and diversity of the generated images in both object-driven and subject-driven generation tasks.
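To make the described objective concrete, below is a minimal sketch (not the paper's released code) of one way to realize it: a factorized Gaussian variational posterior over the fine-tuned weights, trained with an expectation of the diffusion loss under sampled weights plus a KL regularization toward the pretrained weights. The names `BayesianLinear`, `prior_sigma`, `init_log_sigma`, and `kl_weight` are illustrative assumptions, not names from the paper.

```python
# Sketch: variational (Bayesian) weights for a fine-tuned layer of a DM,
# regularized toward the pretrained solution. Assumptions labeled above.
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class BayesianLinear(nn.Module):
    """Linear layer whose weights follow q(w) = N(mu, sigma^2), initialized
    at the pretrained weights; the prior is a Gaussian centered on them."""

    def __init__(self, pretrained: nn.Linear, init_log_sigma: float = -6.0):
        super().__init__()
        # Prior mean: frozen copy of the pretrained weights.
        self.register_buffer("w_pre", pretrained.weight.detach().clone())
        # Variational parameters, initialized at the pretrained solution.
        self.mu = nn.Parameter(pretrained.weight.detach().clone())
        self.log_sigma = nn.Parameter(torch.full_like(self.mu, init_log_sigma))
        self.bias = pretrained.bias  # kept deterministic for simplicity

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reparameterization trick: w = mu + sigma * eps, eps ~ N(0, I), so
        # each forward pass samples weights from a broadened distribution.
        w = self.mu + self.log_sigma.exp() * torch.randn_like(self.mu)
        return F.linear(x, w, self.bias)

    def kl_to_pretrained(self, prior_sigma: float = 1e-2) -> torch.Tensor:
        # KL( N(mu, sigma^2) || N(w_pre, prior_sigma^2) ), summed over
        # elements; this is the regularization toward the pretrained DM.
        var = (2.0 * self.log_sigma).exp()
        return 0.5 * (
            (var + (self.mu - self.w_pre) ** 2) / prior_sigma**2
            - 1.0
            + 2.0 * (math.log(prior_sigma) - self.log_sigma)
        ).sum()
```

Training would then combine the usual noise-prediction loss with the KL term, e.g. `loss = diffusion_mse + kl_weight * sum(m.kl_to_pretrained() for m in bayesian_layers)`. One plausible way to obtain the claimed zero extra inference cost (an assumption here, not a detail stated in the abstract) is to use the posterior mean `mu` as deterministic weights at sampling time.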