Machine learning models have demonstrated remarkable success across diverse domains but remain vulnerable to adversarial attacks. Empirical defence mechanisms often fall short because new attacks constantly emerge and render existing defences obsolete. In response, the field has shifted from empirical defences to certification-based defences. Among these advancements, randomized smoothing has emerged as a promising technique. This study reviews the theoretical foundations, empirical effectiveness, and applications of randomized smoothing in verifying machine learning classifiers. We explore in depth the fundamental concepts underlying randomized smoothing, highlighting its theoretical guarantees for certifying robustness against adversarial perturbations. We also discuss the challenges facing existing methodologies and offer perspectives on potential solutions. This paper is novel in its attempt to systematise the existing knowledge on randomized smoothing.