Rapid advances in multimodal large language models have enabled the generation of hyper-realistic images from textual descriptions. However, these advances also raise serious concerns about unauthorized use, which hinders their broader distribution. Traditional watermarking methods often require complex integration or degrade image quality. To address these challenges, we introduce a novel framework, Towards Effective user Attribution for latent diffusion models via Watermark-Informed Blending (TEAWIB). TEAWIB incorporates a ready-to-use configuration approach that allows seamless integration of user-specific watermarks into generative models: each user can directly apply a pre-configured set of parameters to the model without altering the original model parameters or compromising image quality. Additionally, noise and augmentation operations are embedded at the pixel level to further secure and stabilize watermarked images. Extensive experiments validate the effectiveness of TEAWIB, demonstrating state-of-the-art performance in both perceptual quality and attribution accuracy.
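The "ready-to-use configuration" idea above can be pictured as blending a pre-configured, user-specific parameter set with frozen base-model weights, leaving the originals untouched. The toy sketch below is purely illustrative: the additive blending form, the `blend_watermark` helper, and the scalar weights are assumptions for exposition, not TEAWIB's actual mechanism.

```python
def blend_watermark(base, user_delta, alpha=1.0):
    # Hypothetical blending: return a watermarked copy of the weights as
    # base + alpha * user_delta, without mutating the frozen base weights.
    return [b + alpha * d for b, d in zip(base, user_delta)]

# Toy stand-ins for a frozen generative-model layer and one user's
# pre-configured watermark parameters (both invented for illustration).
base = [0.5, -1.2, 0.3, 0.8]
user_delta = [0.01, -0.02, 0.0, 0.03]

watermarked = blend_watermark(base, user_delta)

assert base == [0.5, -1.2, 0.3, 0.8]  # original parameters are unchanged
assert watermarked != base            # the blended copy carries the watermark
```

Because each user's delta is applied as a separate, pre-packaged configuration, attribution amounts to checking which configuration produced a given output, with no retraining or edits to the shared base model.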