Analytical diffusion models offer a mathematically transparent path to generative modeling by formulating the denoising score as an empirical-Bayes posterior mean. However, this interpretability comes at a prohibitive cost: the standard formulation requires a full-dataset scan at every timestep, so inference scales linearly with dataset size. In this work, we present the first systematic study of this scalability bottleneck. We challenge the prevailing assumption that the entire training set is necessary, uncovering the phenomenon of Posterior Progressive Concentration: the effective golden support of the denoising score is not static but shrinks asymptotically from the global manifold to a local neighborhood as the signal-to-noise ratio increases. Capitalizing on this, we propose Dynamic Time-Aware Golden Subset Diffusion (GoldDiff), a training-free framework that decouples inference complexity from dataset size. Instead of static retrieval, GoldDiff uses a coarse-to-fine mechanism to dynamically pinpoint the ``Golden Subset'' for inference. Theoretically, we derive rigorous bounds guaranteeing that our sparse approximation converges to the exact score. Empirically, GoldDiff achieves a $\bf 71\times$ speedup on AFHQ while matching or surpassing full-scan baselines. Most notably, we demonstrate the first successful scaling of analytical diffusion to ImageNet-1K, unlocking a scalable, training-free paradigm for large-scale generative modeling.
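To make the full-scan bottleneck and the subset idea concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes the standard analytical form in which, for $x_t = \alpha_t x_0 + \sigma_t \varepsilon$, the score is $(\alpha_t\,\mathbb{E}[x_0 \mid x_t] - x_t)/\sigma_t^2$ with the posterior mean taken over all training points; the `subset_score` function is a hypothetical nearest-neighbor proxy for the dynamically selected Golden Subset, not GoldDiff's actual coarse-to-fine retrieval.

```python
import numpy as np

def empirical_score(x_t, data, alpha_t, sigma_t):
    """Exact analytical score: full scan over the dataset.

    Posterior weights w_i are proportional to N(x_t; alpha_t * x_i, sigma_t^2 I),
    so the cost is O(N) per query point per timestep.
    """
    d2 = np.sum((x_t - alpha_t * data) ** 2, axis=1)
    logw = -d2 / (2.0 * sigma_t**2)
    w = np.exp(logw - logw.max())          # log-sum-exp trick for stability
    w /= w.sum()
    post_mean = w @ data                    # E[x_0 | x_t] under the empirical prior
    return (alpha_t * post_mean - x_t) / sigma_t**2

def subset_score(x_t, data, alpha_t, sigma_t, k):
    """Sparse approximation: restrict the scan to the k nearest points.

    At high SNR the posterior weights concentrate on a small neighborhood,
    so a small k already recovers the full-scan score almost exactly.
    """
    d2 = np.sum((x_t - alpha_t * data) ** 2, axis=1)
    idx = np.argpartition(d2, k)[:k]        # crude stand-in for Golden Subset retrieval
    return empirical_score(x_t, data[idx], alpha_t, sigma_t)
```

At small `sigma_t` (high SNR) the Gaussian weights decay so fast that points outside a local neighborhood contribute essentially zero mass, which is why the truncated sum can match the full scan; at low SNR the weights spread out and a larger subset is needed.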