Recent advances in deep learning have yielded remarkable results across many tasks in computer vision (CV) and natural language processing (NLP). At the intersection of CV and NLP lies the problem of image captioning, where the robustness of captioning models against adversarial attacks has not been well studied. This paper presents a novel adversarial attack strategy, AICAttack (Attention-based Image Captioning Attack), designed to attack image captioning models through subtle perturbations of input images. Operating in a black-box attack scenario, our algorithm requires no access to the target model's architecture, parameters, or gradient information. We introduce an attention-based candidate selection mechanism that identifies the optimal pixels to attack, followed by a customised differential evolution method to optimise the perturbations of those pixels' RGB values. We demonstrate AICAttack's effectiveness through extensive experiments on benchmark datasets against multiple victim models. The results show that our method outperforms current state-of-the-art techniques, achieving consistently higher attack success rates.
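The attack pipeline described above (attention-guided pixel selection followed by differential evolution over RGB perturbations) can be sketched as follows. This is a minimal, illustrative sketch, not the authors' implementation: the attention map, the `black_box_score` surrogate, the pixel budget `k`, and all DE hyperparameters (`pop_size`, `F`, `CR`, `eps`) are assumptions chosen for a runnable toy example; a real attack would query the victim captioning model and score the degradation of its generated captions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 8x8 RGB "image" standing in for a real input; values in [0, 1].
image = rng.random((8, 8, 3))

# Hypothetical attention scores; in AICAttack these would come from an
# attention-based candidate selection mechanism, here random for illustration.
attention = rng.random((8, 8))

# Select the k pixels with the highest attention as attack candidates.
k = 5
flat_idx = np.argsort(attention.ravel())[-k:]
pixels = np.stack(np.unravel_index(flat_idx, attention.shape), axis=1)  # (k, 2)


def black_box_score(img):
    """Stand-in for the (inaccessible) victim model's caption quality.

    A real black-box attack would query the captioning model and measure
    how much the caption degrades; this toy surrogate just rewards any
    deviation from the clean image so the sketch is self-contained.
    Lower score = worse caption = stronger attack.
    """
    return -np.abs(img - image).sum()


def perturb(img, genome):
    """Apply one candidate: genome is a (k, 3) array of RGB offsets."""
    out = img.copy()
    for (r, c), delta in zip(pixels, genome):
        out[r, c] = np.clip(out[r, c] + delta, 0.0, 1.0)
    return out


# Standard DE/rand/1/bin over the (k, 3) perturbation genomes.
pop_size, gens, F, CR, eps = 20, 30, 0.5, 0.9, 0.1
pop = rng.uniform(-eps, eps, size=(pop_size, k, 3))
fitness = np.array([black_box_score(perturb(image, g)) for g in pop])

for _ in range(gens):
    for i in range(pop_size):
        others = [j for j in range(pop_size) if j != i]
        a, b, c = pop[rng.choice(others, 3, replace=False)]
        mutant = np.clip(a + F * (b - c), -eps, eps)   # mutation
        cross = rng.random((k, 3)) < CR                # binomial crossover
        trial = np.where(cross, mutant, pop[i])
        f = black_box_score(perturb(image, trial))
        if f < fitness[i]:  # minimising caption quality = stronger attack
            pop[i], fitness[i] = trial, f

best = pop[np.argmin(fitness)]
adversarial = perturb(image, best)
```

Because only the scalar score of each query is used, the loop never needs the victim model's architecture, parameters, or gradients, matching the black-box setting the abstract describes.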