Diffusion-based generative models have demonstrated exceptional performance, yet their iterative sampling procedures remain computationally expensive. A prominent strategy to mitigate this cost is distillation, with offline distillation offering particular advantages in terms of efficiency, modularity, and flexibility. In this work, we identify two key observations that motivate a principled distillation framework: (1) while diffusion models have been viewed through the lens of dynamical systems theory, powerful and underexplored tools can be further leveraged; and (2) diffusion models inherently impose structured, semantically coherent trajectories in latent space. Building on these observations, we introduce the Koopman Distillation Model (KDM), a novel offline distillation approach grounded in Koopman theory - a classical framework for representing nonlinear dynamics linearly in a transformed space. KDM encodes noisy inputs into an embedded space where a learned linear operator propagates them forward, followed by a decoder that reconstructs clean samples. This enables single-step generation while preserving semantic fidelity. We provide theoretical justification for our approach: (1) under mild assumptions, the learned diffusion dynamics admit a finite-dimensional Koopman representation; and (2) proximity in the Koopman latent space correlates with semantic similarity in the generated outputs, allowing for effective trajectory alignment. KDM achieves highly competitive performance across standard offline distillation benchmarks.
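To make the encode, linearly propagate, decode pipeline described above concrete, here is a minimal PyTorch-style sketch. The module names (KoopmanDistillationSketch), dimensions, and MLP encoder/decoder choices are illustrative assumptions and not the paper's actual architecture or training procedure; the point is only the structure of single-step generation through a learned linear Koopman operator.

```python
# Minimal sketch of the encode -> linear-propagate -> decode pipeline.
# Module names, layer sizes, and MLP architectures are illustrative
# assumptions, not the paper's implementation; training losses are omitted.
import torch
import torch.nn as nn

class KoopmanDistillationSketch(nn.Module):
    def __init__(self, data_dim=3 * 32 * 32, latent_dim=512):
        super().__init__()
        # Encoder: lifts a noisy sample into the Koopman (observable) space.
        self.encoder = nn.Sequential(
            nn.Linear(data_dim, latent_dim), nn.SiLU(),
            nn.Linear(latent_dim, latent_dim),
        )
        # Learned linear Koopman operator: propagates the embedding from the
        # noisy endpoint of the diffusion trajectory toward the clean endpoint.
        self.koopman = nn.Linear(latent_dim, latent_dim, bias=False)
        # Decoder: maps the propagated embedding back to a clean sample.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, latent_dim), nn.SiLU(),
            nn.Linear(latent_dim, data_dim),
        )

    def forward(self, x_noisy):
        z = self.encoder(x_noisy)      # embed the noisy input
        z_clean = self.koopman(z)      # one linear step in the latent space
        return self.decoder(z_clean)   # reconstruct the clean sample

# Single-step generation: one forward pass from pure noise to a sample.
model = KoopmanDistillationSketch()
x_T = torch.randn(16, 3 * 32 * 32)     # noise, as at the start of sampling
x_0 = model(x_T)                        # distilled one-step sample
```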