I See Dead People: Gray-Box Adversarial Attack on Image-To-Text Models

Modern image-to-text systems typically adopt an encoder-decoder framework with two main components: an image encoder, which extracts image features, and a transformer-based decoder, which generates captions. Taking inspiration from the analysis of neural networks' robustness against adversarial perturbations, we propose a novel gray-box algorithm for creating adversarial examples in image-to-text models. Unlike image classification, which has a finite set of class labels, finding visually similar adversarial examples in an image-to-text task is more challenging because the captioning system allows for a virtually infinite space of possible captions. In this paper, we present a gray-box adversarial attack on image-to-text models, in both untargeted and targeted variants. We formulate the discovery of adversarial perturbations as an optimization problem that uses only the image-encoder component, which makes the proposed attack language-model agnostic. Through experiments on the ViT-GPT2 model, the most-used image-to-text model on Hugging Face, and the Flickr30k dataset, we demonstrate that the proposed attack successfully generates visually similar adversarial examples with both untargeted and targeted captions. Notably, the attack operates in a gray-box manner and requires no knowledge of the decoder module. We also show that our attacks fool the popular open-source platform Hugging Face.
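The idea of optimizing a perturbation against the image encoder alone can be illustrated with a minimal sketch. The abstract does not state the exact loss or constraint, so everything here is an assumption made for illustration: a toy linear map `W` stands in for the real encoder, the targeted objective is the squared distance between the features of the perturbed image and those of a hypothetical target image, and visual similarity is enforced with an L-infinity bound `eps`. The names `encode`, `sq_dist`, and `targeted_attack` are hypothetical and are not taken from the paper.

```python
import random


def encode(W, x):
    """Toy linear stand-in for an image encoder: features = W @ x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def targeted_attack(W, x, x_target, eps=0.1, lr=0.01, steps=200):
    """Projected gradient descent that uses encoder features only.

    Minimizes ||encode(x + delta) - encode(x_target)||^2 subject to
    |delta_i| <= eps and x_i + delta_i in [0, 1]. No decoder is involved.
    """
    f_target = encode(W, x_target)
    delta = [0.0] * len(x)
    for _ in range(steps):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        residual = [fa - ft for fa, ft in zip(encode(W, x_adv), f_target)]
        # Analytic gradient for the linear toy encoder: 2 * W^T @ residual.
        grad = [2.0 * sum(W[j][i] * residual[j] for j in range(len(W)))
                for i in range(len(x))]
        for i in range(len(x)):
            d = delta[i] - lr * grad[i]               # gradient step
            d = max(-eps, min(eps, d))                # project onto the L_inf ball
            d = max(0.0, min(1.0, x[i] + d)) - x[i]   # keep pixels in [0, 1]
            delta[i] = d
    return delta


# Assumed toy data: an 8-"pixel" image and a 4-dimensional feature space.
rng = random.Random(0)
W = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(4)]
x = [rng.uniform(0.0, 1.0) for _ in range(8)]          # "clean image"
x_target = [rng.uniform(0.0, 1.0) for _ in range(8)]   # "target image"

delta = targeted_attack(W, x, x_target)
x_adv = [xi + di for xi, di in zip(x, delta)]
print("feature distance before:", sq_dist(encode(W, x), encode(W, x_target)))
print("feature distance after: ", sq_dist(encode(W, x_adv), encode(W, x_target)))
print("max |delta|:", max(abs(d) for d in delta))
```

With a real model, the analytic gradient would be replaced by automatic differentiation through the actual encoder. The decoder never appears in the loop, which is the sense in which such an attack is language-model agnostic.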
