Adversarial Training of Two-Layer Polynomial and ReLU Activation Networks via Convex Optimization

Training neural networks that are robust to adversarial attacks remains an important problem in deep learning, especially as heavily overparameterized models are adopted in safety-critical settings. Drawing on recent work that reformulates the training problems for two-layer ReLU and polynomial activation networks as convex programs, we devise a convex semidefinite program (SDP) for adversarial training of two-layer polynomial activation networks and prove that the convex SDP achieves the same globally optimal solution as its nonconvex counterpart. On multiple datasets, the convex SDP is observed to improve robust test accuracy against $\ell_\infty$ attacks relative to the original convex training formulation. Additionally, we present scalable implementations of adversarial training for two-layer polynomial and ReLU networks that are compatible with standard machine learning libraries and GPU acceleration. Leveraging these implementations, we retrain the final two fully connected layers of a Pre-Activation ResNet-18 model on the CIFAR-10 dataset with both polynomial and ReLU activations. The two "robustified" models achieve significantly higher robust test accuracies against $\ell_\infty$ attacks than a Pre-Activation ResNet-18 model trained with sharpness-aware minimization, demonstrating the practical utility of convex adversarial training on large-scale problems.
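For orientation, the generic min-max problem that adversarial training against $\ell_\infty$ attacks usually refers to can be sketched as follows. This is the standard textbook form, not the paper's exact formulation: the symbols $u_j$, $\alpha_j$, $\sigma$, $\ell$, $\delta_i$, and $\varepsilon$ are notation assumed here for illustration, and the abstract itself does not state the objective or the specific activation.

$$
f_\theta(x) = \sum_{j=1}^{m} \sigma\!\left(x^\top u_j\right)\alpha_j,
\qquad
\min_{\theta = \{u_j,\,\alpha_j\}} \; \sum_{i=1}^{n} \; \max_{\|\delta_i\|_\infty \le \varepsilon} \; \ell\!\left(f_\theta(x_i + \delta_i),\, y_i\right)
$$

Here $\sigma$ stands for the ReLU or polynomial activation, $\ell$ for a training loss, and $\varepsilon$ for the attack radius. Under this reading, robust test accuracy is the fraction of test points that remain correctly classified under the perturbations $\delta_i$, with $\|\delta_i\|_\infty \le \varepsilon$, produced by the attack being evaluated.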

