Perceptual similarity metrics have become progressively more correlated with human judgments of perceptual similarity; however, despite recent advances, adding an imperceptible distortion can still compromise these metrics. In this study, we systematically examine the robustness of these metrics to imperceptible adversarial perturbations. Following the two-alternative forced-choice (2AFC) experimental design with two distorted images and one reference image, we take the distorted image that the metric judges closer to the reference and perturb it via an adversarial attack until the metric flips its judgment. We first show that all metrics in our study are susceptible to perturbations generated via common adversarial attacks such as FGSM, PGD, and the One-pixel attack. Next, we attack the widely adopted LPIPS metric using spatial-transformation-based adversarial perturbations (stAdv) in a white-box setting to craft adversarial examples that transfer effectively to other similarity metrics in a black-box setting. We also combine the spatial attack stAdv with the $\ell_\infty$-bounded PGD attack to increase transferability, and we use the resulting adversarial examples to benchmark the robustness of both traditional and recently developed metrics. Our benchmark provides a good starting point for discussion and further research on the robustness of metrics to imperceptible adversarial perturbations.
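The judgment-flipping procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a toy differentiable metric (squared L2 distance in a linear feature space) stands in for LPIPS, and the names `toy_metric`, `toy_metric_grad`, `pgd_flip`, as well as the parameters `eps`, `alpha`, and `steps`, are hypothetical choices made for illustration. The sketch takes the distorted image the metric currently ranks closer to the reference and applies $\ell_\infty$-bounded PGD steps that increase its distance until the metric prefers the other image.

```python
import numpy as np

def toy_metric(x, ref, W):
    """Squared L2 distance in a linear feature space.
    Assumption: a stand-in for a learned metric such as LPIPS."""
    diff = W @ (x - ref)
    return float(diff @ diff)

def toy_metric_grad(x, ref, W):
    """Analytic gradient of toy_metric with respect to x."""
    return 2.0 * W.T @ (W @ (x - ref))

def pgd_flip(ref, x_close, x_far, W, eps=0.03, alpha=0.005, steps=50):
    """Perturb x_close (the image currently judged closer to ref) within an
    l_inf ball of radius eps until the metric ranks it farther than x_far.
    Returns (adversarial image, whether the judgment flipped, steps taken)."""
    target = toy_metric(x_far, ref, W)
    x_adv = x_close.copy()
    for step in range(steps):
        if toy_metric(x_adv, ref, W) > target:
            return x_adv, True, step
        # Ascend the metric: move in the sign of the gradient (PGD step).
        x_adv = x_adv + alpha * np.sign(toy_metric_grad(x_adv, ref, W))
        # Project back onto the l_inf ball around the original image.
        x_adv = np.clip(x_adv, x_close - eps, x_close + eps)
        # Keep pixel values in the valid range [0, 1].
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv, toy_metric(x_adv, ref, W) > target, steps

# Hypothetical 4-"pixel" example with a diagonal feature map.
W = np.diag([1.0, 2.0, 0.5, 1.0])
ref = np.full(4, 0.5)
x_close = ref + 0.05   # initially judged closer to the reference
x_far = ref + 0.072    # initially judged farther from the reference
x_adv, flipped, n_steps = pgd_flip(ref, x_close, x_far, W)
print("flipped:", flipped, "steps:", n_steps)
```

With a smaller budget (for example `eps=0.01` in this toy setup) the perturbation cannot move the image far enough and the judgment does not flip, which is the sense in which a metric can be called robust at a given perturbation budget.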