Recursively Trained Diffusion Models: Limiting Collapse Distribution and Spectral Characterization

Recursive training of generative models on their own outputs can lead to model collapse, a compounding drift away from the true data distribution. Existing theoretical work bounds finite-round error accumulation for diffusion models, but two questions remain open: what distribution does the recursion converge to, and how fast? We answer both, isolating a mechanism distinct from imperfect learning: even with perfect score estimation and exact sampling, early stopping of the reverse diffusion (required for numerical stability) drives a progressive drift away from the data distribution. We prove that this recursion converges geometrically to a unique limiting distribution, which admits a closed-form characterization as an infinite mixture of increasingly Gaussian-smoothed versions of the data distribution. A Hermite spectral decomposition of this limit reveals that recursive training acts as a low-pass filter: higher-order modes, which encode fine non-Gaussian structure, are attenuated much more strongly than coarse modes. This spectral picture motivates annealed truncation schedules that progressively shrink truncation times across retraining rounds; we prove that any schedule converging to $0$ asymptotically eliminates recursive compounding. Finally, we show that our idealized characterization is robust: in the presence of discretization and score estimation errors, the learned distribution remains in a Wasserstein-2 ball around the ideal limit, with mode-dependent rates that contract high-order errors faster than low-order ones. We validate the theory on synthetic Gaussian mixtures and CIFAR-10.
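
The low-pass behaviour can be illustrated with a standard fact about the Ornstein-Uhlenbeck (OU) semigroup, sketched below under assumptions that are not taken from the abstract: the forward noising process is the OU process $dX = -X\,dt + \sqrt{2}\,dW$, and truncating at time $t$ acts like the smoothing operator $(P_t f)(x) = \mathbb{E}[f(e^{-t}x + \sqrt{1 - e^{-2t}}\,Z)]$ with $Z \sim N(0,1)$. Under this normalization the probabilists' Hermite polynomials satisfy $P_t \mathrm{He}_n = e^{-nt}\,\mathrm{He}_n$, so mode $n$ is damped by $e^{-nt}$. The function names `hermite_e`, `ou_smooth`, and `attenuation` are hypothetical helpers, and the paper's own recursion and normalization may differ.

```python
import math

def hermite_e(n, x):
    """Probabilists' Hermite polynomial He_n(x) via the three-term recurrence
    He_{k+1}(x) = x * He_k(x) - k * He_{k-1}(x)."""
    prev, curr = 1.0, x
    if n == 0:
        return prev
    for k in range(1, n):
        prev, curr = curr, x * curr - k * prev
    return curr

def ou_smooth(f, x, t, half_width=10.0, num_points=4001):
    """Approximate (P_t f)(x) = E[f(e^{-t} x + sqrt(1 - e^{-2t}) Z)], Z ~ N(0, 1),
    using the trapezoid rule on [-half_width, half_width].
    Assumes the OU normalization dX = -X dt + sqrt(2) dW (an illustrative choice)."""
    a = math.exp(-t)
    b = math.sqrt(1.0 - a * a)
    h = 2.0 * half_width / (num_points - 1)
    total = 0.0
    for i in range(num_points):
        z = -half_width + i * h
        w = 0.5 if i in (0, num_points - 1) else 1.0  # trapezoid end-point weights
        total += w * f(a * x + b * z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

def attenuation(n, t, x=2.0):
    """Ratio (P_t He_n)(x) / He_n(x); the eigen-relation predicts exp(-n * t).
    The evaluation point x = 2.0 is chosen so that He_1..He_5 are nonzero there."""
    return ou_smooth(lambda y: hermite_e(n, y), x, t) / hermite_e(n, x)

if __name__ == "__main__":
    t = 0.3  # hypothetical truncation time
    for n in range(1, 6):
        print(n, attenuation(n, t), math.exp(-n * t))
```

In this toy setting with $t = 0.3$, the predicted damping factor is $e^{-0.3} \approx 0.74$ for mode 1 and $e^{-1.5} \approx 0.22$ for mode 5, so repeated smoothing suppresses high-order structure far faster than coarse structure.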
