Prompting interfaces allow users to quickly adjust the output of generative models in both vision and language. However, small changes and design choices in the prompt can lead to significant differences in the output. In this work, we develop a black-box framework for generating adversarial prompts for unstructured image and text generation. These prompts, which can be standalone or prepended to benign prompts, induce specific behaviors in the generative process, such as generating images of a particular object or generating high-perplexity text.