LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

Latent Consistency Models (LCMs) have achieved impressive performance in accelerating text-to-image generative tasks, producing high-quality images with minimal inference steps. LCMs are distilled from pre-trained latent diffusion models (LDMs), requiring only ~32 A100 GPU hours of training. This report extends LCMs' potential in two respects. First, by applying LoRA distillation to Stable-Diffusion models including SD-V1.5, SSD-1B, and SDXL, we expand LCM's scope to larger models with significantly less memory consumption, achieving superior image generation quality. Second, we identify the LoRA parameters obtained through LCM distillation as a universal Stable-Diffusion acceleration module, named LCM-LoRA. LCM-LoRA can be plugged directly into various fine-tuned Stable-Diffusion models or LoRAs without training, and thus serves as a universally applicable accelerator for diverse image generation tasks. Compared with previous numerical PF-ODE solvers such as DDIM and DPM-Solver, LCM-LoRA can be viewed as a plug-in neural PF-ODE solver that possesses strong generalization abilities. Project page: https://github.com/luosiallen/latent-consistency-model.
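To make the "plug in without training" idea concrete, here is a minimal toy sketch of how a low-rank (LoRA) update can be added to a frozen weight matrix, and how two such updates (one standing in for an acceleration LoRA, one for a style LoRA) might be summed into the same base weights. The abstract does not spell out the merge rule; the additive, individually scaled combination below is an assumption based on common LoRA practice, and every name, shape, and value (`lora_delta`, `merge`, the 2x2 matrices, the scales) is hypothetical rather than taken from the authors' code.

```python
# Toy illustration only: hypothetical helpers, not the authors' implementation.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_delta(down, up, scale=1.0):
    """Return the low-rank weight update scale * (up @ down)."""
    return [[scale * v for v in row] for row in matmul(up, down)]

def merge(base, *deltas):
    """Add any number of weight updates to a copy of the base weights."""
    merged = [row[:] for row in base]
    for delta in deltas:
        for i, row in enumerate(delta):
            for j, v in enumerate(row):
                merged[i][j] += v
    return merged

# A frozen 2x2 "pre-trained" weight matrix (assumed values).
base = [[1.0, 0.0], [0.0, 1.0]]

# Two rank-1 adapters: `up` is 2x1 and `down` is 1x2 (assumed values).
accel_up, accel_down = [[1.0], [2.0]], [[0.5, 0.0]]   # stands in for an acceleration LoRA
style_up, style_down = [[0.0], [1.0]], [[0.0, 2.0]]   # stands in for a style LoRA

accel = lora_delta(accel_down, accel_up, scale=1.0)
style = lora_delta(style_down, style_up, scale=0.5)

# Both updates are added to the same base weights; no training step is involved.
merged = merge(base, accel, style)
print(merged)
```

Because each adapter only contributes an additive update, the base weights stay untouched and either adapter can be dropped or rescaled independently, which is one way to read the abstract's claim that the module plugs into fine-tuned models or other LoRAs without further training.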

