Text-to-Image (T2I) synthesis has advanced rapidly in recent years, and representative models such as DALL-E, Stable Diffusion, and Imagen have achieved promising results in practice. It has become increasingly common for model developers to selectively adopt pre-trained text encoders and conditional diffusion models from third-party platforms and integrate them into customized (personalized) T2I models. However, this practice is vulnerable to backdoor attacks. In this work, we propose a Combinational Backdoor Attack against Customized T2I models (CBACT2I) that targets this application scenario. Unlike previous backdoor attacks against T2I models, CBACT2I embeds the backdoor into the text encoder and the conditional diffusion model separately. The customized T2I model exhibits backdoor behavior only when the backdoored text encoder is used in combination with the backdoored conditional diffusion model. These properties make CBACT2I stealthier and more flexible than prior backdoor attacks against T2I models. Extensive experiments demonstrate the effectiveness of CBACT2I with different backdoor triggers and different backdoor targets on the open-source Stable Diffusion model. This work reveals the backdoor vulnerabilities of customized T2I models and calls for countermeasures to mitigate backdoor threats in this scenario.