Interpretable Spectrum Transformation Attacks to Speaker Recognition

Adversarial attacks on speaker recognition have succeeded mainly in white-box scenarios. When adversarial voices generated by attacking white-box surrogate models are applied to black-box victim models, i.e. \textit{transfer-based} black-box attacks, their transferability is not only far from satisfactory but also lacks an interpretable basis. To address these issues, we propose a general framework, named spectral transformation attack based on modified discrete cosine transform (STA-MDCT), to improve the transferability of adversarial voices to a black-box victim model. Specifically, we first apply MDCT to the input voice. Then, we slightly modify the energy of different frequency bands to capture the salient regions of the adversarial noise in the time-frequency domain that are critical to a successful attack. Unlike existing approaches that operate on voices in the time domain, the proposed framework operates on voices in the time-frequency domain, which improves the interpretability, transferability, and imperceptibility of the attack. Moreover, it can be implemented with any gradient-based attacker. To exploit the advantage of model ensembling, we implement STA-MDCT not only with a single white-box surrogate model but also with an ensemble of surrogate models. Finally, we visualize the saliency maps of adversarial voices with class activation maps (CAM), which offers an interpretable basis for transfer-based attacks in speaker recognition for the first time. Extensive comparisons with five representative attackers show that the CAM visualization clearly explains the effectiveness of STA-MDCT and the weaknesses of the comparison methods, and that the proposed method outperforms the comparison methods by a large margin.
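The two steps named in the abstract (apply MDCT, then slightly modify the energy of different frequency bands) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the sine window, the frame size `n_bins`, the number of bands `n_bands`, the random per-band gain drawn from `[1 - rho, 1 + rho]`, and the function names. The gradient step of the attack and the surrogate models are omitted.

```python
import numpy as np

def sine_window(n_bins):
    """Sine window of length 2*n_bins (satisfies the Princen-Bradley condition)."""
    return np.sin(np.pi * (np.arange(2 * n_bins) + 0.5) / (2 * n_bins))

def mdct_basis(n_bins):
    """MDCT cosine basis, shape (n_bins, 2*n_bins)."""
    n = np.arange(2 * n_bins)
    k = np.arange(n_bins)
    return np.cos(np.pi / n_bins * np.outer(k + 0.5, n + 0.5 + n_bins / 2))

def mdct(x, n_bins=256):
    """Windowed MDCT with hop n_bins; returns coefficients of shape (frames, n_bins)."""
    pad_tail = (-len(x)) % n_bins
    # One extra block of zeros on each side so every signal block is covered by two frames.
    padded = np.concatenate([np.zeros(n_bins), x, np.zeros(pad_tail + n_bins)])
    n_frames = len(padded) // n_bins - 1
    frames = np.stack([padded[i * n_bins:(i + 2) * n_bins] for i in range(n_frames)])
    return (frames * sine_window(n_bins)) @ mdct_basis(n_bins).T

def imdct(coeffs, length):
    """Inverse MDCT by windowed overlap-add; returns the first `length` signal samples."""
    n_frames, n_bins = coeffs.shape
    # The 2/n_bins scale gives perfect reconstruction when the window is applied twice.
    frames = (2.0 / n_bins) * (coeffs @ mdct_basis(n_bins)) * sine_window(n_bins)
    out = np.zeros((n_frames + 1) * n_bins)
    for i in range(n_frames):
        out[i * n_bins:(i + 2) * n_bins] += frames[i]
    return out[n_bins:n_bins + length]

def spectrum_transform(x, rho=0.1, n_bands=16, n_bins=256, rng=None):
    """Hypothetical band-energy perturbation: scale each MDCT band by a gain near 1."""
    if n_bins % n_bands != 0:
        raise ValueError("n_bins must be divisible by n_bands")
    rng = np.random.default_rng() if rng is None else rng
    coeffs = mdct(x, n_bins)
    gains = rng.uniform(1.0 - rho, 1.0 + rho, size=n_bands)  # one gain per band
    per_bin = np.repeat(gains, n_bins // n_bands)             # expand to one gain per bin
    return imdct(coeffs * per_bin, len(x))

# Toy stand-in for a 1 s utterance sampled at 16 kHz (not real speech).
t = np.arange(16000) / 16000.0
voice = 0.5 * np.sin(2 * np.pi * 220.0 * t)
transformed = spectrum_transform(voice, rho=0.1, rng=np.random.default_rng(0))
```

In a transfer-based attack, a gradient-based attacker could compute its gradients on such transformed copies of the input instead of on the raw waveform; how STA-MDCT actually combines the two is described in the paper itself, not in this sketch.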

Voiceprint recognition

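Speaker recognition, also called voiceprint recognition (VPR), is a biometric technology that automatically identifies a speaker from the speaker-specific information carried in speech, using computers and modern information recognition techniques. Research in speaker recognition aims to extract features that characterize the speaker, build effective models and systems, and achieve accurate automatic speaker identification.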
