We study the consistency of surrogate risks for robust binary classification. It is common to learn robust classifiers by adversarial training, which seeks to minimize the expected $0$-$1$ loss when each example can be maliciously corrupted within a small ball. We give a simple and complete characterization of the set of surrogate loss functions that are \emph{consistent}, i.e., that can replace the $0$-$1$ loss without affecting the minimizing sequences of the original adversarial risk, for any data distribution. We also prove a quantitative version of adversarial consistency for the $\rho$-margin loss. Our results reveal that the class of adversarially consistent surrogates is substantially smaller than in the standard setting, where many common surrogates are known to be consistent.