SlimDiff: Training-Free, Activation-Guided Hands-free Slimming of Diffusion Models

Diffusion models (DMs), lauded for their generative performance, are computationally prohibitive due to their billion-scale parameters and iterative denoising dynamics. Existing efficiency techniques, such as quantization, timestep reduction, or pruning, offer savings in compute, memory, or runtime but remain bottlenecked by their reliance on fine-tuning or retraining to recover performance. In this work, we introduce SlimDiff, an automated, activation-informed structural compression framework that reduces both attention and feedforward dimensionalities in DMs while being entirely gradient-free. SlimDiff reframes DM compression as a spectral approximation task, where activation covariances across denoising timesteps define low-rank subspaces that guide dynamic pruning under a fixed compression budget. This activation-aware formulation mitigates error accumulation across timesteps by applying module-wise decompositions over functional weight groups (query--key interactions, value--output couplings, and feedforward projections) rather than isolated matrix factorizations. It also adaptively allocates sparsity across modules to respect the non-uniform geometry of diffusion trajectories. SlimDiff achieves up to 35\% acceleration and $\sim$100M parameter reduction over baselines, with generation quality on par with uncompressed models, without any backpropagation. Crucially, our approach requires only about 500 calibration samples, over 70$\times$ fewer than prior methods. To our knowledge, this is the first closed-form, activation-guided structural compression of DMs that is entirely training-free, providing both theoretical clarity and practical efficiency.
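
To make the idea of an activation covariance defining a low-rank subspace concrete, the following is a minimal sketch of a generic closed-form, activation-weighted truncated SVD for a single linear layer. It is not SlimDiff's actual procedure: the function name `activation_aware_low_rank`, the regularizer `eps`, and the use of one pooled calibration matrix `X` are assumptions made for illustration, and the module-wise grouping and cross-module budget allocation described above are not modeled here.

```python
import numpy as np

def activation_aware_low_rank(W, X, rank, eps=1e-6):
    """Factor W (d_out x d_in) as A @ B with inner dimension `rank`.

    The factors minimize the output error ||X @ W.T - X @ (A @ B).T||_F on the
    calibration activations X (n x d_in), up to the small ridge term `eps`.
    Hypothetical helper for illustration; no gradients are involved.
    """
    n, d_in = X.shape
    # Regularized second-moment (covariance) matrix of the input activations.
    C = X.T @ X / n + eps * np.eye(d_in)
    # Symmetric square root and inverse square root via eigendecomposition.
    evals, evecs = np.linalg.eigh(C)
    S = (evecs * np.sqrt(evals)) @ evecs.T
    S_inv = (evecs / np.sqrt(evals)) @ evecs.T
    # Truncated SVD in the activation-weighted space (Eckart-Young optimum).
    U, s, Vt = np.linalg.svd(W @ S, full_matrices=False)
    A = U[:, :rank] * s[:rank]      # shape (d_out, rank)
    B = Vt[:rank] @ S_inv           # shape (rank, d_in)
    return A, B
```

Replacing `W` with the pair `(A, B)` stores `rank * (d_out + d_in)` parameters instead of `d_out * d_in`. On calibration data with a strongly anisotropic covariance, this weighting usually gives a lower output error than truncating the SVD of `W` directly, which is the intuition behind letting activations rather than weights alone guide the compression.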
