Robustness to adversarial perturbations is of paramount concern in modern machine learning. One of the state-of-the-art methods for training robust classifiers is adversarial training, which involves minimizing a supremum-based surrogate risk. The statistical consistency of surrogate risks is well understood in the context of standard machine learning, but not in the adversarial setting. In this paper, we characterize which supremum-based surrogates are consistent for distributions absolutely continuous with respect to Lebesgue measure in binary classification. Furthermore, we obtain quantitative bounds relating adversarial surrogate risks to the adversarial classification risk. Lastly, we discuss implications for the $\mathcal{H}$-consistency of adversarial training.