We study three models of adversarial training for multiclass classification, each designed to construct classifiers that are robust to adversarial perturbations of the data in the agnostic-classifier setting. We prove the existence of Borel measurable robust classifiers in each model and provide a unified perspective on the adversarial training problem, expanding the connections with optimal transport initiated by the authors in previous work and developing new connections between adversarial training in the multiclass setting and total variation regularization. As a corollary, we prove the existence of Borel measurable solutions to the agnostic adversarial training problem in the binary classification setting. This improves on earlier results in the adversarial training literature, where robust classifiers were only known to exist within the enlarged universal $\sigma$-algebra of the feature space.