Diffusion models (DMs) produce high-quality images, yet their sampling remains costly when adapted to new domains. Distilled DMs are faster but typically remain confined to their teacher's domain. Thus, fast and high-quality generation for novel domains relies on two-stage pipelines: Adapt-then-Distill or Distill-then-Adapt. However, both add design complexity and often degrade quality or diversity. We introduce Uni-DAD, a single-stage pipeline that unifies DM distillation and adaptation. It couples two training signals: (i) a dual-domain distribution-matching distillation (DMD) objective that guides the student toward the distributions of a source teacher and a target teacher, and (ii) a multi-head generative adversarial network (GAN) loss that encourages target realism across multiple feature scales. The source-domain distillation preserves diverse source knowledge, while the multi-head GAN stabilizes training and reduces overfitting, especially in few-shot regimes. The inclusion of a target teacher facilitates adaptation to more structurally distant domains. We evaluate Uni-DAD on two comprehensive benchmarks, few-shot image generation (FSIG) and subject-driven personalization (SDP), using diffusion backbones. It delivers quality better than or comparable to state-of-the-art (SoTA) adaptation methods even with fewer than four sampling steps, and often surpasses two-stage pipelines in both quality and diversity. Code: https://github.com/yaramohamadi/uni-DAD.
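To make the coupling of the two training signals concrete, the following is a minimal sketch of how a dual-domain DMD term and a multi-head GAN term could be combined into one student objective. It is not taken from the Uni-DAD code. All function names, the weights `w_src`, `w_tgt`, and `w_gan`, the use of a single shared "fake" score, and the non-saturating GAN loss are assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dual_domain_dmd_direction(fake_score, source_score, target_score,
                              w_src=0.5, w_tgt=0.5):
    """Hypothetical DMD-style update direction for one generated sample.

    DMD-style methods typically push the student along the difference between
    a score estimated on its own samples and a teacher score. Here we ASSUME
    two such differences (source teacher, target teacher) are weighted and summed.
    """
    return [
        w_src * (f - s) + w_tgt * (f - t)
        for f, s, t in zip(fake_score, source_score, target_score)
    ]


def multi_head_gan_generator_loss(head_logits):
    """Non-saturating generator loss averaged over discriminator heads.

    head_logits holds one list of logits per head (e.g. one head per feature
    scale); each logit is that head's score for a generated sample.
    """
    per_head = [
        sum(softplus(-z) for z in logits) / len(logits)
        for logits in head_logits
    ]
    return sum(per_head) / len(per_head)


def student_objective(dmd_source, dmd_target, head_logits,
                      w_src=1.0, w_tgt=1.0, w_gan=0.1):
    # Weighted sum of the two DMD terms and the multi-head GAN term.
    # The weights are illustrative placeholders, not values from the paper.
    return (w_src * dmd_source
            + w_tgt * dmd_target
            + w_gan * multi_head_gan_generator_loss(head_logits))


if __name__ == "__main__":
    direction = dual_domain_dmd_direction([1.0], [0.0], [0.5])
    loss = student_objective(1.0, 2.0, [[0.0], [0.3, -0.2]])
    print(direction, loss)
```

In this sketch, setting `w_tgt` to zero reduces the DMD part to distillation from the source teacher alone, while setting `w_gan` to zero removes the adversarial signal; how Uni-DAD actually balances these terms should be checked against the released code.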