Evil Operation: Breaking Speaker Recognition with PaddingBack

Machine Learning as a Service (MLaaS) has gained popularity due to advancements in machine learning. However, reliance on untrusted third-party platforms has raised concerns about AI security, particularly regarding backdoor attacks. Recent research has shown that speech backdoors, like image backdoors, can use transformations as triggers. However, human ears easily detect these transformations, which arouses suspicion. In this paper, we introduce PaddingBack, an inaudible backdoor attack that uses malicious operations to make poisoned samples indistinguishable from clean ones. Instead of using external perturbations as triggers, we exploit padding, a widely used speech signal operation, to break speaker recognition systems. Our experimental results demonstrate the effectiveness of the proposed approach, which achieves a high attack success rate while maintaining high benign accuracy. Furthermore, PaddingBack resists defense methods while remaining stealthy to human perception. The results of the stealthiness experiment are available at https://nbufabio25.github.io/paddingback/.
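The abstract does not specify how the padding trigger is constructed, so the following is only a minimal sketch of the general idea of an operation-based trigger, not the authors' implementation. It assumes the trigger is a fixed number of constant-valued samples appended to the waveform and that poisoned samples are relabeled to an attacker-chosen target speaker. The function names (`pad_waveform`, `poison_dataset`, `benign_accuracy`, `attack_success_rate`) and the parameters `pad_len`, `poison_rate`, and `target_label` are hypothetical. The two metrics follow the definitions commonly used in the backdoor literature, which may differ in detail from those used in the paper.

```python
import numpy as np


def pad_waveform(wave, pad_len, value=0.0):
    """Append `pad_len` constant samples to a 1-D waveform (assumed trigger form)."""
    wave = np.asarray(wave, dtype=np.float32)
    return np.pad(wave, (0, pad_len), mode="constant", constant_values=value)


def poison_dataset(waves, labels, target_label, poison_rate, pad_len, seed=0):
    """Pad a random fraction of the samples and relabel them to `target_label`.

    Returns the new waveforms, the new labels, and the sorted poisoned indices.
    """
    rng = np.random.default_rng(seed)
    n_poison = int(round(poison_rate * len(waves)))
    chosen = set(rng.choice(len(waves), size=n_poison, replace=False).tolist())

    out_waves, out_labels = [], []
    for i, (wave, label) in enumerate(zip(waves, labels)):
        if i in chosen:
            out_waves.append(pad_waveform(wave, pad_len))
            out_labels.append(target_label)
        else:
            out_waves.append(np.asarray(wave, dtype=np.float32))
            out_labels.append(label)
    return out_waves, out_labels, sorted(chosen)


def benign_accuracy(preds, labels):
    """Fraction of clean (unpadded) test samples that are classified correctly."""
    return float(np.mean(np.asarray(preds) == np.asarray(labels)))


def attack_success_rate(preds_on_triggered, true_labels, target_label):
    """Fraction of triggered samples from non-target speakers predicted as the target."""
    preds = np.asarray(preds_on_triggered)
    true = np.asarray(true_labels)
    mask = true != target_label  # samples already belonging to the target do not count
    if not mask.any():
        return float("nan")
    return float(np.mean(preds[mask] == target_label))
```

In a sketch like this, the appended samples add no audible content of their own, which is the intuition behind calling such a trigger inaudible; whether a given model actually learns to associate the padding with the target speaker would have to be checked experimentally.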



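Speaker recognition, also called voiceprint recognition (VPR), is a biometric technique that automatically identifies a speaker from the speaker-specific information carried in speech, using computers and modern information recognition technology. Research in this area aims to extract features from speech that characterize the speaker and to build effective models and systems for accurate automatic speaker identification.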
