Deep neural networks are known to be vulnerable to small adversarial perturbations of test data. To defend against adversarial attacks, probabilistic classifiers have been proposed as an alternative to deterministic ones. However, the literature reports conflicting findings on how effective probabilistic classifiers are compared with deterministic ones. In this paper, we clarify the role of randomization in building adversarially robust classifiers. Given a base hypothesis set of deterministic classifiers, we show the conditions under which a randomized ensemble outperforms the hypothesis set in adversarial risk, extending previous results. Additionally, we show that for any probabilistic classifier (including randomized ensembles), there exists a deterministic classifier that outperforms it. Finally, we give an explicit description of the deterministic hypothesis set that contains such a deterministic classifier for many commonly used types of probabilistic classifiers, namely randomized ensembles and parametric or input noise injection.
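The two main claims can be illustrated on a toy problem. The sketch below is a minimal example under assumptions of our own: a single labeled input, two hypothetical perturbed points `a` and `b`, two hand-built base classifiers `h1` and `h2`, and a thresholding rule (`threshold_classifier`, with an arbitrarily chosen `alpha`) used to build a deterministic classifier from the ensemble. None of these names or numbers come from the abstract, and the thresholding rule is only one plausible way to obtain such a classifier.

```python
# Toy setup (all names and numbers are illustrative assumptions):
# one clean input "x" with true label 1, and an adversary that may
# move it to either perturbed point "a" or "b".
DATA = [("x", 1)]                     # (input, label) pairs, uniform weight
PERTURBATIONS = {"x": ["a", "b"]}     # allowed adversarial points per input


# Two deterministic base classifiers, each fooled at a different point.
def h1(point):
    return 1 if point == "a" else 0


def h2(point):
    return 1 if point == "b" else 0


def error_probability(classifiers, weights, point, label):
    """Probability that a classifier drawn with `weights` mislabels `point`."""
    return sum(w for h, w in zip(classifiers, weights) if h(point) != label)


def adversarial_risk(classifiers, weights):
    """Average over the data of the worst-case error probability."""
    total = 0.0
    for x, y in DATA:
        total += max(error_probability(classifiers, weights, p, y)
                     for p in PERTURBATIONS[x])
    return total / len(DATA)


def threshold_classifier(classifiers, weights, alpha):
    """Deterministic rule: predict 1 iff the ensemble puts mass > alpha on label 1."""
    def h(point):
        p_one = sum(w for c, w in zip(classifiers, weights) if c(point) == 1)
        return 1 if p_one > alpha else 0
    return h


base = [h1, h2]
risk_h1 = adversarial_risk([h1], [1.0])          # adversary moves x to "b"
risk_h2 = adversarial_risk([h2], [1.0])          # adversary moves x to "a"
risk_mix = adversarial_risk(base, [0.5, 0.5])    # uniform randomized ensemble

# Deterministic classifier outside the base set (alpha = 0.25 is an assumption).
h_det = threshold_classifier(base, [0.5, 0.5], alpha=0.25)
risk_det = adversarial_risk([h_det], [1.0])

print(risk_h1, risk_h2, risk_mix, risk_det)      # 1.0 1.0 0.5 0.0
```

In this toy case each base classifier has adversarial risk 1, the uniform ensemble halves it to 0.5, and the thresholded classifier, which does not belong to the base set, reaches 0. This mirrors both statements: randomization can improve on a fixed hypothesis set, yet a deterministic classifier from a larger set can do better still.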