This technical report introduces our top-ranked solution to the TiFA workshop MLLM attack challenge, which employs two approaches, \ie suffix injection and projected gradient descent (PGD). Specifically, we first append the text of an incorrectly labeled (pseudo-labeled) option to the original query as a suffix. Using this modified query, our second approach applies PGD to add imperceptible perturbations to the image. Combining these two techniques enables successful attacks on the LLaVA 1.5 model.
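The PGD step above can be sketched in generic form. The snippet below is a minimal illustration, not the exact attack code: it assumes some differentiable `model` and an attack objective `loss_fn` (for the challenge setting, this would be, e.g., the likelihood of the injected wrong-option suffix under LLaVA 1.5); `eps`, `alpha`, and `steps` are illustrative hyperparameters. Each iteration ascends the loss via the gradient sign, then projects the perturbation back into an $\ell_\infty$ ball of radius `eps` so it stays imperceptible.

```python
import torch

def pgd_attack(model, image, loss_fn, eps=8 / 255, alpha=2 / 255, steps=10):
    """Sketch of an L-infinity PGD attack (assumed setup, not the exact
    challenge code): maximize loss_fn(model(adv)) while keeping the
    perturbation within an eps-ball around the clean image."""
    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = loss_fn(model(adv))
        grad, = torch.autograd.grad(loss, adv)
        with torch.no_grad():
            adv = adv + alpha * grad.sign()                # gradient-sign ascent step
            adv = image + (adv - image).clamp(-eps, eps)   # project into the eps-ball
            adv = adv.clamp(0.0, 1.0)                      # keep a valid pixel range
        adv = adv.detach()
    return adv
```

In the actual attack, `image` would be the preprocessed input image and `loss_fn` would score the model's answer against the pseudo-labeled suffix appended to the query.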