Diffusion-based video super-resolution (VSR) has recently achieved remarkable fidelity but still suffers from prohibitive sampling costs. While distribution matching distillation (DMD) can accelerate diffusion models toward one-step generation, applying it directly to VSR often leads to unstable training and to supervision that is both degraded and insufficient. To address these issues, we propose DUO-VSR, a three-stage framework built upon a Dual-Stream Distillation strategy that unifies distribution matching and adversarial supervision for one-step VSR. First, a Progressive Guided Distillation Initialization stabilizes subsequent training through trajectory-preserving distillation. Next, the Dual-Stream Distillation jointly optimizes the DMD and Real-Fake Score Feature GAN (RFS-GAN) streams, with the latter providing complementary adversarial supervision by leveraging discriminative features from both the real and fake score models. Finally, a Preference-Guided Refinement stage further aligns the student with perceptual quality preferences. Extensive experiments demonstrate that DUO-VSR achieves superior visual quality and efficiency compared with previous one-step VSR approaches.