Recent work has tried to increase the verifiability of adversarially trained networks by running the attacks over domains larger than the original perturbations and adding various regularization terms to the objective. However, these algorithms either underperform or require complex and expensive stage-wise training procedures, hindering their practical applicability. We present IBP-R, a novel verified training algorithm that is both simple and effective. IBP-R induces network verifiability by coupling adversarial attacks on enlarged domains with a regularization term, based on inexpensive interval bound propagation, that minimizes the gap between the non-convex verification problem and its approximations. By leveraging recent branch-and-bound frameworks, we show that IBP-R obtains state-of-the-art verified robustness-accuracy trade-offs for small perturbations on CIFAR-10 while training significantly faster than relevant previous work. Additionally, we present UPB, a novel branching strategy that, relying on a simple heuristic based on $\beta$-CROWN, reduces the cost of state-of-the-art branching algorithms while yielding splits of comparable quality.
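For readers unfamiliar with interval bound propagation, the primitive on which the regularization term is based, the following is a minimal sketch of how an input box is pushed through one affine layer and a ReLU. It is an illustration under stated assumptions, not the IBP-R objective itself: the function names (`affine_bounds`, `relu_bounds`, `unstable_count`), the toy weights, and the perturbation radius are all hypothetical. Counting ReLUs whose pre-activation interval straddles zero is shown only because these are the neurons at which convex relaxations of the network become loose.

```python
def affine_bounds(lower, upper, weights, bias):
    """Propagate the box [lower, upper] through y = W x + b (interval arithmetic)."""
    out_lower, out_upper = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                # Non-negative weight: lower input gives lower output.
                lo += w * l
                hi += w * u
            else:
                # Negative weight: the roles of the input bounds swap.
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def unstable_count(lower, upper):
    """Number of neurons whose pre-activation interval contains zero strictly inside."""
    return sum(1 for l, u in zip(lower, upper) if l < 0.0 < u)


# Toy example (all values are illustrative assumptions).
x = [1.0, -1.0]
eps = 0.5  # l_inf perturbation radius
lower = [v - eps for v in x]
upper = [v + eps for v in x]

W = [[1.0, 1.0], [2.0, -1.0]]
b = [0.0, 0.0]

pre_l, pre_u = affine_bounds(lower, upper, W, b)
post_l, post_u = relu_bounds(pre_l, pre_u)

print("pre-activation bounds:", pre_l, pre_u)
print("post-activation bounds:", post_l, post_u)
print("unstable ReLUs:", unstable_count(pre_l, pre_u))
```

In this toy case the first neuron has pre-activation bounds $[-1, 1]$ and is therefore unstable, while the second has bounds $[1.5, 4.5]$ and behaves linearly over the whole input box.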