With the rapid adoption of diffusion models for visual content generation, proving authorship and protecting copyright have become critical. This challenge is particularly important when model owners keep their models private and may be unwilling or unable to handle authorship issues, making third-party verification essential. A natural solution is to embed watermarks for later verification. However, existing methods require access to model weights and rely on computationally heavy procedures, rendering them impractical and non-scalable. To address these challenges, we propose NoisePrints, a lightweight watermarking scheme that utilizes the random seed used to initialize the diffusion process as a proof of authorship without modifying the generation process. Our key observation is that the initial noise derived from a seed is highly correlated with the generated visual content. By incorporating a hash function into the noise sampling process, we further ensure that recovering a valid seed from the content is infeasible. We also show that sampling an alternative seed that passes verification is infeasible, and demonstrate the robustness of our method under various manipulations. Finally, we show how to use cryptographic zero-knowledge proofs to prove ownership without revealing the seed. By keeping the seed secret, we increase the difficulty of watermark removal. In our experiments, we validate NoisePrints on multiple state-of-the-art diffusion models for images and videos, demonstrating efficient verification using only the seed and output, without requiring access to model weights.
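The core mechanism described above (hash the seed, derive the initial noise from the hash, then verify by checking that the noise correlates with the content) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names, the choice of SHA-256, the cosine-similarity test, the threshold of 0.1, and especially `toy_generate`, which is a stand-in that simply mixes the initial noise with unrelated noise so that the output retains a partial correlation. In practice the comparison would be made against some representation of the real generated image or video, which the abstract does not specify.

```python
import hashlib
import numpy as np


def noise_from_seed(seed: bytes, shape: tuple) -> np.ndarray:
    """Derive Gaussian noise from a seed via a hash (hypothetical helper).

    Hashing first means the PRNG state is a one-way function of the seed,
    so finding a seed that yields a chosen noise tensor requires inverting
    the hash.
    """
    digest = hashlib.sha256(seed).digest()
    rng = np.random.default_rng(int.from_bytes(digest, "big"))
    return rng.standard_normal(shape)


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened arrays."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def verify(seed: bytes, content: np.ndarray, threshold: float = 0.1) -> bool:
    """Accept the claim if the seed's noise correlates with the content.

    Only the seed and the content are needed; no model weights are used.
    The threshold is an assumed value for this toy setting.
    """
    noise = noise_from_seed(seed, content.shape)
    return cosine(noise, content) > threshold


def toy_generate(seed: bytes, shape: tuple = (4, 64, 64)) -> np.ndarray:
    """Stand-in for a diffusion model (assumption, not a real sampler).

    The output keeps a fraction of the initial noise and adds unrelated
    noise, mimicking the observation that output and initial noise stay
    correlated.
    """
    noise = noise_from_seed(seed, shape)
    unrelated = np.random.default_rng(0).standard_normal(shape)
    return 0.3 * noise + unrelated


if __name__ == "__main__":
    content = toy_generate(b"alice-secret")
    print("true seed accepted: ", verify(b"alice-secret", content))
    print("wrong seed accepted:", verify(b"mallory-guess", content))
```

In this toy setting, an unrelated seed produces noise whose cosine similarity with the content concentrates near zero in high dimensions, which is why a random alternative seed is very unlikely to pass the threshold. The zero-knowledge step mentioned in the abstract is not covered by this sketch.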