Diffusion Models (DMs) achieve state-of-the-art performance on generative tasks, fueling a wave of AI-for-Art applications. Despite their commercial success, DMs also provide tools for copyright violation: infringers can train DMs on paintings created by human artists without authorization and then generate novel paintings in a similar style. In this paper, we show that it is possible to create an image $x'$ that looks similar to an image $x$ to human viewers but is unrecognizable to DMs. We build a framework to define and evaluate such adversarial examples for diffusion models. Based on this framework, we further propose AdvDM, an algorithm that generates adversarial examples for DMs. By optimizing over different latent variables sampled from the reverse process of DMs, AdvDM conducts a Monte-Carlo estimation of adversarial examples for DMs. Extensive experiments show that the estimated adversarial examples effectively hinder DMs from extracting the features of the protected images. Our method can be a powerful tool for human artists to protect their copyright against infringers who use DM-based AI-for-Art applications.
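The Monte-Carlo idea described above can be illustrated with a toy sketch. This is not the authors' implementation, and every name in it is hypothetical: it assumes a linear stand-in `W` for the noise predictor, a made-up schedule `alphas_bar`, latents drawn with the closed-form noising step $x_t = \sqrt{\bar\alpha_t}\,x + \sqrt{1-\bar\alpha_t}\,\epsilon$ (a simplification chosen for this example), and a sign-gradient update under an $L_\infty$ budget. It only shows the general pattern of averaging a denoising loss over sampled latents and pushing the image in the direction that increases that loss.

```python
import numpy as np

def make_toy_model(dim, seed=0):
    # Hypothetical stand-in for a trained noise predictor: eps_hat = W @ x_t.
    rng = np.random.default_rng(seed)
    return rng.normal(size=(dim, dim)) / np.sqrt(dim)

def diffusion_loss_and_grad(x, W, alphas_bar, rng, n_samples):
    """Monte-Carlo estimate of E ||eps - W x_t||^2 and its gradient w.r.t. x."""
    dim = x.shape[0]
    t = rng.integers(0, len(alphas_bar), size=n_samples)   # random timesteps
    a = np.sqrt(alphas_bar[t])[:, None]                    # signal scale
    b = np.sqrt(1.0 - alphas_bar[t])[:, None]              # noise scale
    eps = rng.normal(size=(n_samples, dim))
    x_t = a * x[None, :] + b * eps                         # sampled latents
    residual = eps - x_t @ W.T                             # eps - W x_t
    loss = np.mean(np.sum(residual ** 2, axis=1))
    # d/dx ||eps - W(a x + b eps)||^2 = -2 a W^T residual, averaged over samples
    grad = np.mean(-2.0 * a * (residual @ W), axis=0)
    return loss, grad

def mc_adversarial_example(x, W, alphas_bar, budget=0.1, step=0.01,
                           n_steps=40, n_samples=128, seed=1):
    """Sign-gradient ascent on the sampled loss, kept within an L-inf budget."""
    rng = np.random.default_rng(seed)
    x_adv = x.copy()
    for _ in range(n_steps):
        _, grad = diffusion_loss_and_grad(x_adv, W, alphas_bar, rng, n_samples)
        x_adv = x_adv + step * np.sign(grad)               # increase the loss
        x_adv = x + np.clip(x_adv - x, -budget, budget)    # stay close to x
    return x_adv

# Toy setup (all values are illustrative assumptions).
dim = 32
W = make_toy_model(dim)
alphas_bar = np.linspace(0.99, 0.1, 10)
x = np.random.default_rng(2).uniform(0.0, 1.0, size=dim)
x_adv = mc_adversarial_example(x, W, alphas_bar)

# Compare both inputs on the same sampled latents.
loss_clean, _ = diffusion_loss_and_grad(x, W, alphas_bar, np.random.default_rng(3), 4000)
loss_adv, _ = diffusion_loss_and_grad(x_adv, W, alphas_bar, np.random.default_rng(3), 4000)
print("clean loss:", loss_clean, "adversarial loss:", loss_adv)
```

In this toy setting the perturbation is bounded by `budget`, which plays the role of keeping $x'$ visually close to $x$, while the averaged loss plays the role of the model's difficulty in fitting the image. A real diffusion model would replace `W` with a neural network and the analytic gradient with automatic differentiation.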