Despite the impressive synthesis quality of text-to-image (T2I) diffusion models, their black-box deployment poses significant regulatory challenges: malicious actors can fine-tune these models to generate illegal content, circumventing existing safeguards through parameter manipulation. It is therefore essential to verify the integrity of T2I diffusion models. To this end, considering the randomness in the outputs of generative models and the high cost of interacting with them, we detect model tampering via the KL divergence between feature distributions of generated images. We propose a novel prompt selection algorithm based on a learning automaton (PromptLA) for efficient and accurate verification. Evaluations on four advanced T2I models (e.g., SDXL, FLUX.1) demonstrate that our method achieves a mean AUC above 0.96 in integrity detection, exceeding baselines by more than 0.2 and showcasing strong effectiveness and generalization. In addition, our approach incurs lower verification cost and is robust against image-level post-processing. To the best of our knowledge, this is the first work addressing the integrity verification of T2I diffusion models, establishing quantifiable standards for AI copyright litigation in practice.
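As a minimal sketch of the detection signal described above, the snippet below fits Gaussians to two sets of image features and computes the closed-form KL divergence between them. The feature extractor, estimator, and regularization here are illustrative assumptions, not necessarily the paper's exact pipeline.

```python
import numpy as np

def gaussian_kl(feats_ref: np.ndarray, feats_test: np.ndarray, eps: float = 1e-6) -> float:
    """KL divergence between Gaussians fit to two (n_samples, dim) feature sets.

    feats_ref: features of images from the trusted reference model.
    feats_test: features of images from the deployed (possibly tampered) model.
    A large KL value suggests the output distribution has shifted.
    """
    k = feats_ref.shape[1]
    mu0, mu1 = feats_ref.mean(axis=0), feats_test.mean(axis=0)
    # Regularize covariances so they stay invertible for small sample sizes.
    cov0 = np.cov(feats_ref, rowvar=False) + eps * np.eye(k)
    cov1 = np.cov(feats_test, rowvar=False) + eps * np.eye(k)
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    # Closed-form KL(N0 || N1) for multivariate Gaussians.
    return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - k + logdet1 - logdet0)
```

In practice the features would come from a pretrained image encoder; here any real-valued feature matrix works, and identical inputs yield a divergence of zero.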
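To illustrate the learning-automaton component, the following is a generic linear reward-inaction (L_RI) update over a probability vector of candidate prompts: the prompt just queried gains probability in proportion to its reward. This is a textbook automaton rule used here as a hedged sketch, not necessarily the exact PromptLA update.

```python
import numpy as np

def la_update(p: np.ndarray, chosen: int, reward: float, lr: float = 0.1) -> np.ndarray:
    """One linear reward-inaction (L_RI) step of a learning automaton.

    p: current probability vector over candidate prompts (sums to 1).
    chosen: index of the prompt that was just queried.
    reward: observed reward in [0, 1], e.g. how discriminative the
            prompt's generated images were for tamper detection.
    """
    indicator = np.eye(len(p))[chosen]
    # Move probability mass toward the chosen prompt, scaled by the reward;
    # the update preserves the sum of p at exactly 1.
    return p + lr * reward * (indicator - p)
```

Iterating this update concentrates probability on prompts that repeatedly earn high rewards, which is what makes the verification both efficient (few queries) and accurate (discriminative prompts).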