Training a generative model with a limited number of samples is a challenging task. Current methods primarily rely on few-shot model adaptation to train the network. However, when data is extremely limited (fewer than 10 samples), the generative network tends to overfit and suffers from content degradation. To address these problems, we propose a novel phasic content fusing few-shot diffusion model with a directional distribution consistency loss, which targets different learning objectives at distinct training stages of the diffusion model. Specifically, we design a phasic training strategy with phasic content fusion that helps our model learn content and style information when the diffusion timestep t is large and local details of the target domain when t is small, improving the capture of content, style, and local details. Furthermore, we introduce a novel directional distribution consistency loss that ensures consistency between the generated and source distributions more efficiently and stably than prior methods, preventing our model from overfitting. Finally, we propose a cross-domain structure guidance strategy that enhances structure consistency during domain adaptation. Theoretical analysis and qualitative and quantitative experiments demonstrate the superiority of our approach over state-of-the-art methods in few-shot generative model adaptation tasks. The source code is available at: https://github.com/sjtuplayer/few-shot-diffusion.