Deep neural networks (DNNs) have found wide applicability in numerous fields due to their ability to accurately learn very complex input-output relations. Despite their accuracy and extensive use, DNNs are highly susceptible to adversarial attacks due to limited generalizability. For future progress in the field, it is essential to build DNNs that are robust to any kind of perturbation of the data points. In the past, many techniques have been proposed to robustify DNNs using first-order derivative information of the network. This paper proposes a new robustification approach based on control theory. A neural network architecture that incorporates feedback control, named Feedback Neural Networks, is proposed. The controller is itself a neural network, which is trained on regular and adversarial data so as to stabilize the system outputs. The novel adversarial training approach based on this feedback control architecture is called Feedback Looped Adversarial Training (FLAT). Numerical results on standard test problems show empirically that our FLAT method is more effective than state-of-the-art methods at guarding against adversarial attacks.