The Stable Diffusion Model (SDM) is a widely used and effective model for text-to-image (T2I) and image-to-image (I2I) generation. Prior work on sampler optimization, model distillation, and network quantization typically leaves the original network architecture unchanged, and the model's large parameter count and heavy computational cost have limited research into modifying the architecture itself. This study focuses on reducing redundant computation in SDM and optimizes the model through both tuning and tuning-free methods. 1) For the tuning method, we first design a model assembly strategy that reconstructs a lightweight model while preserving performance through distillation. Second, to mitigate the performance loss caused by pruning, we incorporate multi-expert conditional convolution (ME-CondConv) into the compressed UNets, increasing network capacity without sacrificing speed. Third, we validate the effectiveness of a multi-UNet switching method for improving network speed. 2) For the tuning-free method, we propose a feature inheritance strategy that accelerates inference by skipping local computations at the block, layer, or unit level within the network structure. We also examine multiple sampling modes for feature inheritance at the time-step level. Experiments demonstrate that both the proposed tuning and tuning-free methods can improve the speed and performance of the SDM. The lightweight model reconstructed by the model assembly strategy increases generation speed by $22.4\%$, while the feature inheritance strategy increases SDM generation speed by $40.0\%$.
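To make the feature inheritance idea concrete, the following is a minimal sketch of block-level skipping, not the authors' implementation. It assumes that an expensive block's output can be cached at one time step and reused ("inherited") at later steps, and that the recompute steps follow a uniform schedule. The names `InheritingBlock`, `uniform_recompute_steps`, and `run_denoising`, the toy list-based "features", and the uniform sampling mode are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Optional, Set, Tuple

Feature = List[float]


class InheritingBlock:
    """Wraps a block; on skipped steps it returns the cached output
    from the most recent recompute step instead of running the block."""

    def __init__(self, block: Callable[[Feature], Feature]) -> None:
        self.block = block
        self.cache: Optional[Feature] = None
        self.calls = 0  # counts real forward passes, for illustration only

    def __call__(self, x: Feature, recompute: bool) -> Feature:
        # Always compute when nothing has been cached yet.
        if recompute or self.cache is None:
            self.cache = self.block(x)
            self.calls += 1
        return self.cache


def uniform_recompute_steps(num_steps: int, interval: int) -> Set[int]:
    """Hypothetical sampling mode: recompute on every `interval`-th step."""
    return set(range(0, num_steps, interval))


def run_denoising(num_steps: int, interval: int) -> Tuple[Feature, int]:
    """Toy denoising loop with one cheap path and one skippable block."""
    cheap = lambda x: [0.9 * v for v in x]                      # run every step
    expensive = InheritingBlock(lambda x: [v + 1.0 for v in x])  # skippable
    schedule = uniform_recompute_steps(num_steps, interval)

    x: Feature = [1.0, 2.0]
    for t in range(num_steps):
        h = cheap(x)
        feat = expensive(h, recompute=(t in schedule))
        x = [a + 0.1 * b for a, b in zip(h, feat)]
    return x, expensive.calls
```

With `run_denoising(10, 2)` the wrapped block runs on only half of the ten steps; the other steps reuse the cached feature. The same wrapper pattern could in principle be applied at the layer or unit level, and other step schedules could replace the uniform one, but which granularity and sampling mode work best is an empirical question that this sketch does not address.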