In this study, we propose a timbre-reserved adversarial attack approach for speaker identification (SID) that not only exploits the weakness of the SID model but also preserves the timbre of the target speaker in a black-box attack setting. Specifically, we generate timbre-reserved fake audio by adding an adversarial constraint during the training of the voice conversion model. We then leverage a pseudo-Siamese network architecture to learn from the black-box SID model, constraining intrinsic similarity and structural similarity simultaneously. The intrinsic similarity loss learns an intrinsic invariance, while the structural similarity loss ensures that the substitute SID model shares a similar decision boundary with the fixed black-box SID model. The substitute model can then be used as a proxy to generate timbre-reserved fake audio for the attack. Experimental results on the Audio Deepfake Detection (ADD) challenge dataset indicate that the attack success rate of the proposed approach reaches up to 60.58% and 55.38% in the white-box and black-box scenarios, respectively, and that the generated audio can deceive both human listeners and machines.
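The two-term objective for the substitute SID model can be illustrated with a minimal sketch. The abstract does not give the loss formulas, so everything here is an assumption made for illustration: the structural similarity loss is taken to be a KL divergence between the black-box posteriors and the substitute posteriors, the intrinsic similarity loss is taken to be a contrastive loss on cosine distance between embedding pairs, and the names `structural_similarity_loss`, `intrinsic_similarity_loss`, `substitute_loss`, `margin`, and `weight` are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def structural_similarity_loss(blackbox_probs, substitute_logits, eps=1e-12):
    """ASSUMED form: KL(black-box || substitute), averaged over the batch.

    Pulling the substitute posteriors toward the black-box posteriors is one
    plausible way to encourage a similar decision boundary.
    """
    sub_probs = softmax(substitute_logits)
    log_ratio = np.log(blackbox_probs + eps) - np.log(sub_probs + eps)
    kl = np.sum(blackbox_probs * log_ratio, axis=-1)
    return float(kl.mean())


def intrinsic_similarity_loss(emb_a, emb_b, same_speaker, margin=0.5):
    """ASSUMED form: contrastive loss on cosine distance between embeddings.

    same_speaker is 1.0 for pairs from one speaker and 0.0 otherwise.
    Same-speaker pairs are pulled together; other pairs are pushed at least
    `margin` apart (the margin value is a hypothetical choice).
    """
    a = emb_a / np.linalg.norm(emb_a, axis=-1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=-1, keepdims=True)
    dist = 1.0 - np.sum(a * b, axis=-1)
    pull = same_speaker * dist ** 2
    push = (1.0 - same_speaker) * np.maximum(margin - dist, 0.0) ** 2
    return float((pull + push).mean())


def substitute_loss(emb_a, emb_b, same_speaker,
                    blackbox_probs, substitute_logits, weight=1.0):
    """Weighted sum of both terms; the weighting scheme is an assumption."""
    intrinsic = intrinsic_similarity_loss(emb_a, emb_b, same_speaker)
    structural = structural_similarity_loss(blackbox_probs, substitute_logits)
    return intrinsic + weight * structural


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy batch: 4 utterance pairs, 8-dim embeddings, 3 enrolled speakers.
    emb_a = rng.normal(size=(4, 8))
    emb_b = rng.normal(size=(4, 8))
    same_speaker = np.array([1.0, 0.0, 1.0, 0.0])
    blackbox_probs = softmax(rng.normal(size=(4, 3)))  # black-box outputs only
    substitute_logits = rng.normal(size=(4, 3))
    print(substitute_loss(emb_a, emb_b, same_speaker,
                          blackbox_probs, substitute_logits))
```

In this sketch only the output posteriors of the black-box model are used, which matches the black-box setting in which its parameters and gradients are unavailable. In practice the embeddings and logits would come from a trainable network rather than random arrays.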