Diffusion models have achieved remarkable success in generating high-quality images. Recently, open-source models, represented by Stable Diffusion (SD), have thrived and become readily customizable, giving rise to a vibrant community of creators and enthusiasts. However, the widespread availability of customized SD models has raised copyright concerns, such as unauthorized model distribution and commercial use without consent. To address these concerns, recent works make SD models output watermarked content for post-hoc forensics. Unfortunately, none of them achieves protection in the challenging white-box setting, in which a malicious user with full access to the model can simply remove or replace the watermarking module so that subsequent verification fails. To this end, we propose \texttt{\method}, the first implementation for this scenario. Briefly, we merge watermark information into the U-Net of Stable Diffusion models via a watermark Low-Rank Adaptation (LoRA) module in a two-stage manner. For the watermark LoRA module, we devise a scaling matrix that enables flexible message updates without retraining. To guarantee fidelity, we design Prior Preserving Fine-Tuning (PPFT), which ensures that the watermark is learned with minimal impact on the model distribution, as validated by proofs. Finally, we conduct extensive experiments and ablation studies to verify our design.
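To make the role of the scaling matrix concrete, the following is a minimal NumPy sketch of one way a message-dependent scaling matrix could sit between the two LoRA factors. It is an illustration under stated assumptions, not the construction used by \texttt{\method}: the abstract does not specify the form of the scaling matrix, so the diagonal matrix with entries in {-1, +1}, the one-bit-per-rank encoding, and the names `lora_delta`, `B`, `A`, and `W` are all hypothetical.

```python
import numpy as np


def lora_delta(B, A, message_bits):
    """Return the low-rank update B @ diag(s) @ A for a given bit string.

    ASSUMPTION: the scaling matrix is diagonal, with one entry per LoRA
    rank, and bit b is encoded as s = 2b - 1 (so 0 -> -1 and 1 -> +1).
    The abstract does not state the actual construction.
    """
    s = 2.0 * np.asarray(message_bits, dtype=float) - 1.0
    if s.shape != (A.shape[0],):
        raise ValueError("this sketch needs exactly one message bit per LoRA rank")
    # Scaling the rows of A by s is equivalent to multiplying by diag(s).
    return B @ (s[:, None] * A)


rng = np.random.default_rng(0)
d_out, d_in, rank = 8, 6, 4

W = rng.normal(size=(d_out, d_in))  # stand-in for one frozen U-Net weight
B = rng.normal(size=(d_out, rank))  # stand-ins for trained LoRA factors
A = rng.normal(size=(rank, d_in))

# Merging the update into W leaves no separate watermark module to strip.
W_msg1 = W + lora_delta(B, A, [1, 0, 1, 1])

# Changing the message only changes the scaling entries; B and A are reused,
# so no retraining is involved in this toy setting.
W_msg2 = W + lora_delta(B, A, [0, 0, 1, 0])
```

In this toy setting, the merged weight has the same shape as the original, so the watermark does not appear as a detachable component, and switching messages requires only recomputing the product with a different scaling vector.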