Speaker recognition technology is applied to various tasks, from personal virtual assistants to secure access systems. However, the robustness of these systems against adversarial attacks, particularly additive perturbations, remains a significant challenge. In this paper, we pioneer the application of robustness certification techniques, initially developed for the image domain, to speaker recognition. Our work fills this gap by transferring to speaker recognition, and improving, randomized smoothing certification techniques against norm-bounded additive perturbations for classification and few-shot learning tasks. We demonstrate the effectiveness of these methods on the VoxCeleb 1 and 2 datasets for several models. We expect this work to improve the robustness of voice biometrics and accelerate research on certification methods in the audio domain.