Diffusion models have achieved great success in image generation through iterative noise estimation. However, the costly iterative denoising process and complex neural networks hinder their use in low-latency real-world applications. Quantization can effectively reduce model complexity, and post-training quantization (PTQ), which requires no fine-tuning, is highly promising for accelerating the denoising process. Unfortunately, we find that, because the distribution of activations varies strongly across denoising steps, existing PTQ methods for diffusion models suffer from distribution mismatch at both the calibration sample level and the reconstruction output level, which leaves performance far from satisfactory, especially in low-bit cases. In this paper, we propose Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models (EDA-DM) to address these issues. Specifically, at the calibration sample level, we select calibration samples based on density and diversity in the latent space, which helps align their distribution with that of the overall samples; at the reconstruction output level, we propose Fine-grained Block Reconstruction, which aligns the outputs of the quantized model and the full-precision model at different network granularities. Extensive experiments demonstrate that EDA-DM outperforms existing post-training quantization frameworks in both unconditional and conditional generation scenarios. At low-bit precision, models quantized with our method even outperform the full-precision models on most datasets.
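The idea of choosing calibration samples by density and diversity in the latent space can be illustrated with a minimal sketch. The abstract does not give the scoring formulas, so everything below is an assumption made for illustration: the function name `allocate_calibration_samples`, the density score (inverse mean distance of a step's features to their own centroid), the diversity score (mean distance of a step's centroid to the other steps' centroids), and the equal weighting of the two. It is not the EDA-DM procedure itself; it only shows how a fixed calibration budget might be spread over denoising steps according to such scores.

```python
import numpy as np


def allocate_calibration_samples(features, total):
    """Split a calibration budget across denoising steps (illustrative only).

    features: array of shape (T, N, D), i.e. N latent feature vectors of
              dimension D collected at each of T denoising steps.
    total:    number of calibration samples to distribute.
    Returns an integer array of length T that sums to `total`.
    """
    T = features.shape[0]
    centroids = features.mean(axis=1)  # (T, D)

    # Assumed density score: steps whose features cluster tightly around
    # their own centroid receive a higher score.
    spread = np.linalg.norm(features - centroids[:, None, :], axis=2).mean(axis=1)
    density = 1.0 / (spread + 1e-8)

    # Assumed diversity score: steps whose centroid lies far from the
    # centroids of the other steps receive a higher score.
    pairwise = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
    diversity = pairwise.sum(axis=1) / max(T - 1, 1)

    # Normalize both scores and combine them with equal weight (an assumption).
    score = density / density.sum() + diversity / (diversity.sum() + 1e-12)
    weights = score / score.sum()

    # Largest-remainder rounding so the counts sum exactly to `total`.
    raw = weights * total
    counts = np.floor(raw).astype(int)
    leftover = int(total - counts.sum())
    order = np.argsort(-(raw - counts))
    counts[order[:leftover]] += 1
    return counts


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for latent features: the scale grows with the step
    # index to mimic activations whose distribution changes across steps.
    feats = np.stack([rng.normal(scale=1.0 + t, size=(32, 16)) for t in range(10)])
    print(allocate_calibration_samples(feats, total=256))
```

In this sketch, steps that are both internally consistent and far from the others get more of the budget; a real pipeline would replace the synthetic features with activations or latents recorded from the full-precision model.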