Latent Consistency Model (LCM) extends the Consistency Model to the latent space and leverages the guided consistency distillation technique to achieve impressive performance in accelerating text-to-image synthesis. However, we observe that LCM struggles to generate images with both clarity and detailed intricacy. To address this limitation, we first delve into and elucidate the underlying causes. Our investigation identifies that the primary issue stems from errors in three distinct areas. Consequently, we introduce Trajectory Consistency Distillation (TCD), which encompasses a trajectory consistency function and strategic stochastic sampling. The trajectory consistency function diminishes the distillation errors by broadening the scope of the self-consistency boundary condition, endowing TCD with the ability to accurately trace the entire trajectory of the Probability Flow ODE. In addition, strategic stochastic sampling is specifically designed to circumvent the accumulated errors inherent in multi-step consistency sampling and is meticulously tailored to complement the TCD model. Experiments demonstrate that TCD not only significantly enhances image quality at low NFEs but also yields more detailed results than the teacher model at high NFEs.
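To give a rough intuition for the contrast drawn above between deterministic multi-step consistency sampling and stochastic sampling with noise re-injection, the following toy sketch uses a stand-in denoiser and a simplified noise scale; the function names, the linear noise injection, and the `gamma` parameter are illustrative assumptions, not the paper's actual schedulers or parameterization.

```python
import numpy as np

def toy_denoiser(x, t):
    # Stand-in for the learned trajectory/consistency function f_theta(x_t, t):
    # here it simply shrinks the sample as a crude "denoising" step.
    return x / (1.0 + t)

def multistep_sample(x, timesteps, gamma, rng):
    """Toy multi-step sampler with gamma-controlled noise re-injection.

    gamma = 0 reduces to deterministic multi-step consistency sampling;
    gamma > 0 re-injects fresh noise at each step, the basic idea behind
    stochastic sampling (heavily simplified here).
    """
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        x0_est = toy_denoiser(x, t_cur)  # consistency-style jump toward t = 0
        noise = rng.standard_normal(x.shape)
        # Re-diffuse the estimate to the next timestep; the true noise scale
        # would follow the diffusion schedule, linear here for illustration.
        x = x0_est + gamma * t_next * noise
    return x
```

With `gamma = 0` every step is a pure deterministic jump, so errors in `toy_denoiser` accumulate unchecked across steps; a positive `gamma` trades some determinism for fresh noise at each step, which is the mechanism the stochastic sampler exploits.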