Recently, many studies have used adversarial examples (AEs) to raise the cost of malicious image editing and copyright violation powered by latent diffusion models (LDMs). Despite their successes, few have studied the surrogate models used to generate the AEs. In this paper, from the perspective of adversarial transferability, we investigate how the properties of the surrogate model influence the performance of AEs for LDMs. Specifically, we view the time-step sampling in the Monte-Carlo-based (MC-based) adversarial attack as selecting surrogate models. We find that the smoothness of the surrogate models differs across time steps, and we substantially improve the performance of MC-based AEs by selecting smoother surrogate models. In light of the theoretical framework on adversarial transferability in image classification, we also conduct a theoretical analysis to explain why smooth surrogate models can also boost AEs for LDMs.