We study the consistency of surrogate risks for robust binary classification. It is common to learn robust classifiers by adversarial training, which seeks to minimize the expected $0$-$1$ loss when each example can be maliciously corrupted within a small ball. We give a simple and complete characterization of the set of surrogate loss functions that are \emph{consistent}, i.e., that can replace the $0$-$1$ loss without affecting the minimizing sequences of the original adversarial risk, for any data distribution. We also prove a quantitative version of adversarial consistency for the $\rho$-margin loss. Our results reveal that the class of adversarially consistent surrogates is substantially smaller than in the standard setting, where many common surrogates are known to be consistent.
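To make the terms above concrete, one standard way to write the objects involved is sketched below. The notation is an assumption for illustration and is not fixed by the abstract: $f$ is a real-valued score function, $\epsilon$ is the perturbation radius, $\mathcal{D}$ is the data distribution over $(x,y)$ with $y \in \{-1,+1\}$, and $\phi$ is a surrogate loss. The convention for ties at $y f(x') = 0$ is likewise an assumption.

\begin{align*}
R^{\epsilon}(f) &= \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\sup_{\|x'-x\|\le\epsilon} \mathbf{1}\{\,y f(x') \le 0\,\}\Big] && \text{(adversarial $0$-$1$ risk)}\\
R^{\epsilon}_{\phi}(f) &= \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\sup_{\|x'-x\|\le\epsilon} \phi\big(y f(x')\big)\Big] && \text{(adversarial surrogate risk)}\\
\phi_{\rho}(\alpha) &= \min\big\{1,\ \max\{0,\ 1-\alpha/\rho\}\big\}, \quad \rho > 0 && \text{($\rho$-margin loss)}
\end{align*}

Under this notation, $\phi$ is adversarially consistent if, for every distribution $\mathcal{D}$, any sequence $(f_n)$ with $R^{\epsilon}_{\phi}(f_n) \to \inf_f R^{\epsilon}_{\phi}(f)$ also satisfies $R^{\epsilon}(f_n) \to \inf_f R^{\epsilon}(f)$.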