Diffusion Models (DMs) have driven great success in artificial-intelligence-generated content, especially in artwork creation, yet they raise new concerns about intellectual property and copyright. For example, infringers can profit by using DMs to imitate human-created paintings without authorization. Recent research suggests that various adversarial examples for diffusion models can be effective tools against such copyright infringement. However, current adversarial examples transfer poorly across different painting-imitation methods and are not robust to straightforward adversarial defenses such as noise purification. Surprisingly, we find that the transferability of adversarial examples can be significantly enhanced by exploiting a fused and modified adversarial loss term under consistent parameters. In this work, we comprehensively evaluate the cross-method transferability of adversarial examples. Our experiments show that our method generates more transferable adversarial examples that are also more robust against this simple adversarial defense.