As diffusion models have achieved success in image generation, many studies have extended them to related fields such as image editing. Unlike image generation, image editing aims to modify an image according to user requests while keeping the rest of the image unchanged. Among these tasks, text-based image editing is the most representative. Some studies have shown that diffusion models are vulnerable to backdoor attacks, in which attackers poison the training data to inject a backdoor into the model. However, previous backdoor attacks on diffusion models have focused primarily on image generation models and have not considered image editing models. Given that image editing models accept multimodal inputs, a new question arises: how effective are triggers of different modalities in backdoor attacks on these models? To address this question, we propose TrojanEdit, a backdoor attack framework for image editing models that can handle triggers of different modalities. We explore five types of visual triggers and three types of textual triggers, and combine them into fifteen types of multimodal triggers, conducting extensive experiments on three backdoor attack goals. Our experimental results show that image editing models exhibit a backdoor bias toward texture triggers. Compared with visual triggers, textual triggers achieve stronger attack effectiveness but cause greater damage to the model's normal functionality. Furthermore, we find that multimodal triggers achieve a good balance between attack effectiveness and the model's normal functionality.