We study the consistency of surrogate risks for robust binary classification. It is common to learn robust classifiers by adversarial training, which seeks to minimize the expected $0$-$1$ loss when each example can be maliciously corrupted within a small ball. We give a simple and complete characterization of the set of surrogate loss functions that are \emph{consistent}, i.e., that can replace the $0$-$1$ loss without affecting the minimizing sequences of the original adversarial risk, for any data distribution. We also prove a quantitative version of adversarial consistency for the $\rho$-margin loss. Our results reveal that the class of adversarially consistent surrogates is substantially smaller than in the standard setting, where many common surrogates are known to be consistent.
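For concreteness, the quantities named in the abstract can be sketched as follows. The notation ($f$ for a real-valued score function, $\epsilon$ for the perturbation radius, $\phi$ for a surrogate loss, $R^\epsilon$ and $R_\phi^\epsilon$ for the risks, labels $y \in \{-1,+1\}$) is assumed here for illustration and follows one common convention; it is not necessarily the notation of the paper itself.

\[
R^\epsilon(f) = \mathbb{E}_{(x,y)}\Big[\sup_{\|x'-x\|\le\epsilon} \mathbf{1}\{\, y f(x') \le 0 \,\}\Big],
\qquad
R_\phi^\epsilon(f) = \mathbb{E}_{(x,y)}\Big[\sup_{\|x'-x\|\le\epsilon} \phi\big(y f(x')\big)\Big],
\]
\[
\phi_\rho(\alpha) = \min\Big(1,\ \max\Big(0,\ 1 - \frac{\alpha}{\rho}\Big)\Big), \qquad \rho > 0 .
\]

Under these assumed definitions, $\phi$ is adversarially consistent if, for every data distribution, any sequence $f_n$ that minimizes $R_\phi^\epsilon$ also minimizes $R^\epsilon$.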