Existing certified training methods can only train models to be robust against a single perturbation type (e.g., $l_\infty$ or $l_2$). However, a model that is certifiably robust against $l_\infty$ perturbations may not be certifiably robust against $l_2$ perturbations (and vice versa), and it also has low robustness against other perturbations (e.g., geometric and patch transformations). By constructing a theoretical framework to analyze and mitigate this tradeoff, we propose \textbf{CURE}, the first multi-norm certified training framework, consisting of several multi-norm certified training methods that attain better \emph{union robustness} when training from scratch or fine-tuning a pre-trained certified model. Inspired by our theoretical findings, we devise bound alignment and connect natural training with certified training for better union robustness. Compared with state-of-the-art (SOTA) certified training, \textbf{CURE} improves union robustness to $32.0\%$ on MNIST, $25.8\%$ on CIFAR-10, and $10.6\%$ on TinyImagenet across different epsilon values. It also leads to better generalization on a diverse set of challenging unseen geometric and patch perturbations, to $6.8\%$ and $16.0\%$, respectively, on CIFAR-10. Overall, our contributions pave a path towards \textit{generalized certified robustness}.
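
For concreteness, one common way to formalize \emph{union robustness} can be sketched as follows. This is an assumption for illustration, since the abstract itself does not state a definition; the symbols $f$ (classifier with logits $f_k$), $\mathcal{D}$ (test set), $\mathcal{P}$ (set of norms), and $\epsilon_p$ (radius for norm $p$) are introduced here and do not come from the text above. Under this reading, a test point counts as robust only if it is certified against every perturbation type simultaneously:
\[
\mathrm{Acc}_{\mathrm{union}}(f) \;=\; \frac{1}{|\mathcal{D}|} \sum_{(x,y)\in\mathcal{D}} \mathbb{1}\Big[\,\forall p \in \mathcal{P},\ \forall x' \text{ with } \|x'-x\|_p \le \epsilon_p:\ \arg\max_k f_k(x') = y\,\Big], \qquad \mathcal{P} = \{2, \infty\}.
\]
Because the indicator for the union is never larger than the indicator for any single norm, this quantity is bounded above by the smallest single-norm certified accuracy, $\mathrm{Acc}_{\mathrm{union}}(f) \le \min_{p\in\mathcal{P}} \mathrm{Acc}_{p}(f)$, which is consistent with the observation that a model certified for one norm may score poorly on the union.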