Deep neural networks are capable of state-of-the-art performance in many classification tasks. However, they are known to be vulnerable to adversarial attacks -- small perturbations to the input that lead to a change in classification. We address this issue from the perspective of backward error and condition number, concepts that have proved useful in numerical analysis. To do this, we build on the work of Beuzeville et al. (2021). In particular, we develop a new class of attack algorithms that use componentwise relative perturbations. Such attacks are highly relevant in the case of handwritten documents or printed texts where, for example, the classification of signatures, postcodes, dates or numerical quantities may be altered by changing only the ink consistency and not the background. This makes the perturbed images look natural to the naked eye. Such ``adversarial ink'' attacks therefore reveal a weakness that can have a serious impact on safety and security. We illustrate the new attacks on real data and contrast them with existing algorithms. We also study the use of a componentwise condition number to quantify vulnerability.
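To illustrate what a componentwise relative perturbation means in this setting, the following minimal sketch scales each pixel by a factor close to one, so that a pixel with value zero stays at zero. It is not the attack algorithm developed in this work. The function name `componentwise_perturb`, the bound `eps`, the random `direction` array, and the convention that background pixels are 0 and ink pixels are positive are all assumptions made for illustration.

```python
import numpy as np


def componentwise_perturb(x, direction, eps):
    """Return x_i * (1 + eps * d_i) with |d_i| <= 1 (hypothetical helper).

    Each change satisfies |dx_i| <= eps * |x_i|, so components that are
    exactly zero (assumed here to be background) are left untouched.
    """
    d = np.clip(direction, -1.0, 1.0)  # enforce |d_i| <= 1
    return x * (1.0 + eps * d)


# Toy "image": 0 = background, positive values = ink (assumed convention).
x = np.array([[0.0, 0.8, 0.0],
              [0.0, 1.0, 0.5]])

# A random direction stands in for whatever an actual attack would compute.
rng = np.random.default_rng(0)
direction = rng.uniform(-1.0, 1.0, size=x.shape)

eps = 0.1
x_adv = componentwise_perturb(x, direction, eps)
```

In this sketch only the ink pixels change, each by at most a fraction `eps` of its own value, which is the sense in which such a perturbation alters the ink and not the background.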