In recent years, remarkable advances in deep neural networks have brought tremendous convenience. However, training a highly effective model requires a substantial quantity of samples, which poses serious potential threats, such as unauthorized exploitation and privacy leakage. In response, we propose HiddenSpeaker, a framework that embeds imperceptible perturbations in training speech samples, rendering them unlearnable for deep-learning-based speaker verification systems that rely on large-scale speaker data for efficient training. HiddenSpeaker uses a simplified error-minimizing method, Single-Level Error-Minimizing (SLEM), to generate specific and effective perturbations. In addition, a hybrid objective function is employed for human perceptual optimization, ensuring that the perturbation is imperceptible to human listeners. We conduct extensive experiments on multiple state-of-the-art (SOTA) speaker verification models to evaluate HiddenSpeaker. Our results demonstrate that HiddenSpeaker not only deceives models with unlearnable samples but also enhances the imperceptibility of the perturbations, and shows strong transferability across different models.
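The core idea of error-minimizing perturbations can be illustrated with a minimal sketch: holding a surrogate model fixed, descend its training loss with respect to a norm-bounded perturbation, so the perturbed sample looks "already learned" and contributes little gradient signal during training. The toy logistic model, step sizes, and budget below are illustrative assumptions, not the paper's actual speaker verification architecture or hyperparameters.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def slem_perturb(x, y, w, b, eps=0.05, lr=0.05, steps=200):
    """Single-level error-minimizing sketch: minimize the *fixed* surrogate's
    loss w.r.t. an L-infinity bounded perturbation delta (|delta_i| <= eps).
    The surrogate here is a toy logistic unit; in practice it would be a
    pretrained speaker model (assumption, not the paper's exact setup)."""
    delta = [0.0] * len(x)
    for _ in range(steps):
        # Surrogate logit on the perturbed sample x + delta.
        z = sum((xi + di) * wi for xi, di, wi in zip(x, delta, w)) + b
        g = sigmoid(z) - y  # dL/dz for binary cross-entropy
        # Gradient step that *decreases* the loss, then project back
        # into the imperceptibility budget [-eps, eps].
        delta = [max(-eps, min(eps, di - lr * g * wi))
                 for di, wi in zip(delta, w)]
    return delta

# Usage: the perturbed sample yields a lower surrogate loss than the clean
# one, so a model trained on it receives almost no learning signal.
x, y = [0.3, -1.2, 0.7, 0.1], 1.0
w, b = [0.8, -0.5, 1.1, -0.2], 0.0
delta = slem_perturb(x, y, w, b)
```

Being "single-level" means only the inner minimization over the perturbation is performed against a fixed surrogate, rather than alternating with model retraining as in bi-level error-minimizing formulations.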