Deep neural networks (DNNs) have found wide applicability in numerous fields due to their ability to accurately learn very complex input-output relations. Despite their accuracy and extensive use, DNNs are highly susceptible to adversarial attacks due to limited generalizability. For future progress in the field, it is essential to build DNNs that are robust to any kind of perturbation of the data points. In the past, many techniques have been proposed to robustify DNNs using first-order derivative information of the network. This paper proposes a new robustification approach based on control theory. A neural network architecture that incorporates feedback control, named Feedback Neural Networks, is proposed. The controller is itself a neural network, which is trained using regular and adversarial data so as to stabilize the system outputs. The novel adversarial training approach based on the feedback control architecture is called Feedback Looped Adversarial Training (FLAT). Numerical results on standard test problems empirically show that our FLAT method is more effective than state-of-the-art methods at guarding against adversarial attacks.