A Quantitative Approximation Framework for Flow Distillation in Diffusion Models

We develop a quantitative approximation framework for diffusion distillation, viewing few-step sampling as error propagation under compositions of learned flow maps. Focusing on trajectory distillation for the probability-flow ODE, we show that local approximation errors can be strongly amplified in low-noise multimodal regimes, where the underlying dynamics become stiff. In an analytically tractable Gaussian-mixture Ornstein--Uhlenbeck setting, we separate two core difficulties: approximating the time-dependent score field, and controlling the dynamical amplification governed by the time-integrated Jacobian bound of the probability-flow ODE. On the approximation side, we prove constructive L^p(p_t) guarantees showing that ReLU--ReQU networks approximate the Gaussian-mixture score uniformly over time, with depth and width scaling polylogarithmically in the target accuracy and explicitly with the mixture geometry. On the stability side, we derive an explicit bound L(t) for the spatial Lipschitz constant of the probability-flow velocity and convert it into a flow-map stability estimate governed by \int_s^t L(u)\,du, making late-time amplification in stiff regimes computable. Building on these estimates, we prove that deep residual compositions efficiently approximate the long-horizon transport, with global error controlled by the stability amplification factor, and we identify a Lipschitz-mismatch regime in which one-step distillation is structurally unfavorable. The resulting theory yields a stability-balanced non-uniform time grid, obtained by uniform partitioning in the cumulative stability coordinate. Experiments support this prediction: with 8 segments, the stability-balanced grid reduces end-to-end relative MSE by up to 51.9\% compared with uniform grids.
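To make the grid construction concrete, the following is a minimal sketch of uniform partitioning in the cumulative stability coordinate, under stated assumptions. Writing \Lambda(t) = \int_{t_0}^{t} L(u)\,du, the knots are chosen so that every segment carries the same share of \Lambda, which equalizes the per-segment amplification bound \exp(\int_{t_k}^{t_{k+1}} L(u)\,du) that follows from a Gronwall-type argument. The function name `stability_balanced_grid`, the trapezoidal quadrature, the linear inversion of \Lambda, and the toy bound `toy_lipschitz_bound` are all hypothetical choices made for illustration; the explicit L(t) derived for the Gaussian-mixture setting is not reproduced here, and the sketch assumes L is strictly positive on the interval.

```python
import math
from bisect import bisect_left


def stability_balanced_grid(lipschitz_bound, t_start, t_end, num_segments, num_quad=2000):
    """Return num_segments + 1 knots on [t_start, t_end] such that each
    segment holds an equal share of the integral of lipschitz_bound.

    Assumes lipschitz_bound(t) > 0 on the interval, so that the cumulative
    integral is strictly increasing and can be inverted.
    """
    # Fine uniform quadrature grid used only to tabulate the integral.
    ts = [t_start + (t_end - t_start) * i / num_quad for i in range(num_quad + 1)]
    ls = [lipschitz_bound(t) for t in ts]

    # Cumulative trapezoidal rule: cum[i] approximates the integral up to ts[i].
    cum = [0.0]
    for i in range(num_quad):
        cum.append(cum[-1] + 0.5 * (ls[i] + ls[i + 1]) * (ts[i + 1] - ts[i]))
    total = cum[-1]

    # Invert the cumulative integral at equally spaced levels k * total / K.
    knots = [t_start]
    for k in range(1, num_segments):
        target = total * k / num_segments
        j = bisect_left(cum, target)  # first index with cum[j] >= target
        lo, hi = cum[j - 1], cum[j]
        frac = 0.0 if hi == lo else (target - lo) / (hi - lo)
        knots.append(ts[j - 1] + frac * (ts[j] - ts[j - 1]))
    knots.append(t_end)
    return knots


def toy_lipschitz_bound(t):
    # Hypothetical placeholder that grows sharply near t = 0 to mimic a
    # stiff regime; it is NOT the bound L(t) derived in the paper.
    return 1.0 / (t + 0.05)


if __name__ == "__main__":
    num_segments = 8
    grid = stability_balanced_grid(toy_lipschitz_bound, 0.0, 1.0, num_segments)
    # For this toy bound the integral over [0, 1] is log(21) in closed form,
    # so each segment has the same amplification bound exp(log(21) / 8).
    per_segment = math.exp(math.log(21.0) / num_segments)
    print("knots:", [round(t, 4) for t in grid])
    print("per-segment amplification bound:", round(per_segment, 4))
```

With a bound that is large near one endpoint, the knots cluster there and spread out where the dynamics are benign, whereas a uniform grid would place one segment across most of the stiff region. In practice the quadrature resolution `num_quad` would need to be fine enough to resolve the steepest part of L.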
