Artificial Neural Networks (ANNs) trained with Backpropagation (BP) show astounding performance and are increasingly used for everyday tasks. However, ANNs are highly vulnerable to adversarial attacks, which alter inputs with small targeted perturbations that drastically degrade the models' performance. The most effective method for making ANNs robust against these attacks is adversarial training, in which the training dataset is augmented with exemplary adversarial samples. Unfortunately, this approach increases training complexity, since generating adversarial samples is computationally very demanding. In contrast to ANNs, humans are not susceptible to adversarial attacks. Therefore, in this work, we investigate whether biologically plausible learning algorithms are more robust against adversarial attacks than BP. In particular, we present an extensive comparative analysis of the adversarial robustness of BP and \textit{Present the Error to Perturb the Input To modulate Activity} (PEPITA), a recently proposed biologically plausible learning algorithm, on various computer vision tasks. We observe that PEPITA has higher intrinsic adversarial robustness and, with adversarial training, a more favourable natural-vs-adversarial performance trade-off: for the same natural accuracies, PEPITA's adversarial accuracies decrease on average by 0.26%, whereas BP's decrease by 8.05%.
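To make the notions of a small targeted perturbation and of adversarial training concrete, the following is a minimal sketch only. The abstract does not name a specific attack, so the one-step sign-gradient attack (FGSM) used here is an assumption, and the logistic-regression model is a hypothetical stand-in, not PEPITA or the networks evaluated in this work. All function and variable names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(w, b, x, y):
    """Mean binary cross-entropy of a logistic-regression model."""
    p = sigmoid(x @ w + b)
    tiny = 1e-12  # avoids log(0)
    return float(-np.mean(y * np.log(p + tiny) + (1 - y) * np.log(1 - p + tiny)))

def fgsm(w, b, x, y, epsilon):
    """One-step sign-gradient perturbation, bounded by epsilon in L-infinity norm.

    Assumption: FGSM is used purely as an illustrative attack.
    """
    p = sigmoid(x @ w + b)
    # Gradient of the per-sample cross-entropy with respect to the input x.
    grad_x = (p - y)[:, None] * w[None, :]
    return x + epsilon * np.sign(grad_x)

# Hypothetical toy model and data (not from the paper).
rng = np.random.default_rng(0)
w = rng.normal(size=5)
b = 0.1
x = rng.normal(size=(8, 5))
y = (rng.random(8) > 0.5).astype(float)

x_adv = fgsm(w, b, x, y, epsilon=0.1)
clean_loss = bce_loss(w, b, x, y)
adv_loss = bce_loss(w, b, x_adv, y)

# Adversarial training augments the training set with such samples,
# which is why each training step costs extra gradient computations.
x_aug = np.concatenate([x, x_adv], axis=0)
y_aug = np.concatenate([y, y], axis=0)
```

In this toy setting each input coordinate moves by at most `epsilon`, yet the loss on `x_adv` is higher than on `x`. Producing `x_adv` requires an additional gradient computation per batch, which illustrates the added training cost mentioned above.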