Diffusion Transformers (DiT) have demonstrated remarkable generative capabilities but remain computationally expensive. Previous acceleration methods, such as pruning and distillation, typically rely on a fixed computational capacity, which leads to insufficient acceleration and degraded generation quality. To address this limitation, we propose \textbf{Elastic Diffusion Transformer (E-DiT)}, an adaptive acceleration framework for DiT that improves efficiency while maintaining generation quality. Specifically, we observe that the generative process of DiT exhibits substantial sparsity (i.e., some computations can be skipped with minimal impact on quality), and that this sparsity varies significantly across samples. Motivated by this observation, E-DiT equips each DiT block with a lightweight router that dynamically identifies sample-dependent sparsity from the input latent. Each router adaptively determines whether its block can be skipped; if the block is not skipped, the router additionally predicts the optimal MLP width reduction ratio within the block. During inference, we further introduce a block-level feature caching mechanism that leverages the router predictions to eliminate redundant computations in a training-free manner. Extensive experiments on 2D image generation (Qwen-Image and FLUX) and 3D asset generation (Hunyuan3D-3.0) demonstrate the effectiveness of E-DiT, which achieves up to $\sim$2$\times$ speedup with negligible loss in generation quality. Code will be available at https://github.com/wangjiangshan0725/Elastic-DiT.
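The per-block routing described in the abstract (skip the block, or run its MLP at reduced width) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the mean pooling, the zero threshold for skipping, the `WIDTH_RATIOS` candidates, the ReLU activation, and the names `route` and `elastic_block` are all assumptions made for illustration, and attention and the feature cache are left out.

```python
import numpy as np

# Hypothetical candidate fractions of the MLP hidden width a router may keep.
WIDTH_RATIOS = (0.25, 0.5, 0.75, 1.0)


def route(x, w_skip, w_width):
    """Return (skip, keep_ratio) for one sample from its mean-pooled latent tokens.

    x: (tokens, dim), w_skip: (dim,), w_width: (dim, len(WIDTH_RATIOS)).
    """
    pooled = x.mean(axis=0)                     # (dim,) summary of the latent
    skip = bool(float(pooled @ w_skip) > 0.0)   # assumed hard skip decision
    keep_ratio = WIDTH_RATIOS[int(np.argmax(pooled @ w_width))]
    return skip, keep_ratio


def elastic_block(x, w_in, w_out, w_skip, w_width):
    """Residual MLP block that is either skipped or run at reduced hidden width.

    w_in: (dim, hidden), w_out: (hidden, dim). Returns (output, width fraction used).
    """
    skip, keep_ratio = route(x, w_skip, w_width)
    if skip:
        return x, 0.0                           # identity: no MLP compute spent
    k = max(1, int(round(keep_ratio * w_in.shape[1])))
    hidden = np.maximum(x @ w_in[:, :k], 0.0)   # ReLU over the first k hidden units
    return x + hidden @ w_out[:k, :], keep_ratio
```

Because the decision is made from each sample's own latent, two inputs passing through the same block can receive different amounts of computation, which is the sample-dependent behaviour the abstract contrasts with fixed-capacity pruning or distillation.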