Recent advancements in generative AI, particularly Latent Diffusion Models (LDMs), have revolutionized image synthesis and manipulation. However, these generative techniques raise concerns about data misappropriation and intellectual property infringement. Adversarial attacks on machine learning models have been extensively studied, and a well-established body of research has repurposed these techniques as benign countermeasures against the misuse of generative AI. Current approaches to safeguarding images from manipulation by LDMs are limited by their reliance on model-specific knowledge and their inability to significantly degrade the semantic quality of generated images. In response to these shortcomings, we propose the Posterior Collapse Attack (PCA), based on the observation that VAEs are prone to posterior collapse during training. Our method minimizes dependence on white-box information from target models, eliminating the implicit reliance on model-specific knowledge. By accessing only a small fraction of the LDM's parameters, specifically the VAE encoder, our method induces a substantial semantic collapse in generation quality, particularly in perceptual consistency, and demonstrates strong transferability across various model architectures. Experimental results show that PCA achieves superior perturbation effects on the image generation of LDMs with lower runtime and VRAM consumption. Our method outperforms existing techniques, offering a more robust and generalizable solution that helps alleviate the socio-technical challenges posed by the rapidly evolving landscape of generative AI.
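To make the core idea concrete, the following is a minimal, purely illustrative sketch of a posterior-collapse-style attack on a toy linear VAE encoder. It is not the paper's implementation: the encoder, dimensions, loss, and optimization budget are all assumptions chosen for clarity. The attack crafts a bounded perturbation delta so that the encoder's posterior q(z|x+delta) collapses toward the prior N(0, I), i.e. its KL divergence to the prior shrinks, so the latent stops carrying information about the image.

```python
import random

random.seed(0)
D, Z = 8, 4  # toy input / latent dimensions (illustrative only)

# Toy linear encoder: q(z|x) = N(mu, I) with mu = W x (unit variance assumed).
W = [[random.gauss(0, 0.5) for _ in range(D)] for _ in range(Z)]

def encode_mu(x):
    return [sum(W[j][i] * x[i] for i in range(D)) for j in range(Z)]

def kl_to_prior(mu):
    # With unit variances: KL(N(mu, I) || N(0, I)) = 0.5 * sum_j mu_j^2
    return 0.5 * sum(m * m for m in mu)

def pca_attack(x, eps=0.1, lr=0.05, steps=200):
    # Projected gradient descent on delta, minimizing the KL term so the
    # posterior mean is driven toward zero (posterior collapse).
    delta = [0.0] * D
    for _ in range(steps):
        mu = encode_mu([x[i] + delta[i] for i in range(D)])
        # d KL / d delta_i = sum_j mu_j * W[j][i]
        grad = [sum(mu[j] * W[j][i] for j in range(Z)) for i in range(D)]
        # Gradient step, then project back into the L-infinity ball of radius eps.
        delta = [max(-eps, min(eps, delta[i] - lr * grad[i])) for i in range(D)]
    return delta

x = [random.gauss(0, 1) for _ in range(D)]
before = kl_to_prior(encode_mu(x))
delta = pca_attack(x)
after = kl_to_prior(encode_mu([x[i] + delta[i] for i in range(D)]))
print(after < before)  # the perturbed image yields a more collapsed posterior
```

In practice the attack described in the abstract would operate on the actual VAE encoder of an LDM (with automatic differentiation rather than a hand-derived gradient), but the same principle applies: a small, imperceptible perturbation is optimized to push the encoder's posterior toward the uninformative prior, degrading any generation conditioned on the resulting latent.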