With the advance of text-to-image (T2I) diffusion models (e.g., Stable Diffusion) and corresponding personalization techniques such as DreamBooth and LoRA, anyone can turn their imagination into high-quality images at an affordable cost. However, adding motion dynamics to existing high-quality personalized T2I models so that they can generate animations remains an open challenge. In this paper, we present AnimateDiff, a practical framework for animating personalized T2I models without requiring model-specific tuning. At the core of our framework is a plug-and-play motion module that is trained once and can then be integrated into any personalized T2I model derived from the same base T2I model. Through our proposed training strategy, the motion module learns transferable motion priors from real-world videos. Once trained, the motion module can be inserted into a personalized T2I model to form a personalized animation generator. We further propose MotionLoRA, a lightweight fine-tuning technique for AnimateDiff that enables a pre-trained motion module to adapt to new motion patterns, such as different shot types, at a low training and data collection cost. We evaluate AnimateDiff and MotionLoRA on several representative, publicly available personalized T2I models collected from the community. The results demonstrate that our approaches help these models generate temporally smooth animation clips while preserving visual quality and motion diversity. Code and pre-trained weights are available at https://github.com/guoyww/AnimateDiff.
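To make the plug-and-play idea concrete, the following toy sketch shows one shared temporal-mixing function composed with two interchangeable per-frame image layers. It is only an illustration under loud assumptions: `apply_per_frame`, `motion_module`, `animate`, `style_a`, and `style_b` are hypothetical names, the "video" is a small list of feature vectors, and simple temporal averaging stands in for the learned motion module. It is not the AnimateDiff architecture or its API.

```python
from typing import Callable, List

Frame = List[float]   # toy per-frame feature vector
Video = List[Frame]   # frames x features


def apply_per_frame(image_layer: Callable[[Frame], Frame], video: Video) -> Video:
    """Run an image-model layer on every frame independently (no temporal mixing)."""
    return [image_layer(frame) for frame in video]


def motion_module(video: Video, scale: float) -> Video:
    """Toy stand-in for a motion module (assumption, not the real design).

    Each frame is pulled toward the mean over time. With scale == 0.0 the
    function is the identity, so the per-frame outputs are left unchanged.
    """
    n_frames = len(video)
    n_dims = len(video[0])
    mean = [sum(frame[d] for frame in video) / n_frames for d in range(n_dims)]
    return [[x + scale * (m - x) for x, m in zip(frame, mean)] for frame in video]


def animate(image_layer: Callable[[Frame], Frame], video: Video, scale: float) -> Video:
    """Compose a per-frame image layer with the shared temporal module."""
    return motion_module(apply_per_frame(image_layer, video), scale)


# Two hypothetical "personalized" variants of the same per-frame layer.
def style_a(frame: Frame) -> Frame:
    return [2.0 * x for x in frame]


def style_b(frame: Frame) -> Frame:
    return [x + 1.0 for x in frame]


video = [[0.0, 1.0], [2.0, 3.0]]
out_a = animate(style_a, video, 0.5)  # same motion_module, style A
out_b = animate(style_b, video, 0.5)  # same motion_module, style B
```

The point of the sketch is only the composition pattern: `motion_module` is written once and reused unchanged with either per-frame layer, which mirrors, in a very simplified way, how a single trained motion module is described as pairing with different personalized T2I models that share a base model.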