Machine Learning as a Service (MLaaS) has gained popularity due to advances in Deep Neural Networks (DNNs). However, relying on untrusted third-party platforms raises concerns about AI security, particularly with respect to backdoor attacks. Recent research has shown that speech backdoors, like image backdoors, can use transformations as triggers. However, human listeners can easily notice these transformations, which raises suspicion. In this paper, we propose PaddingBack, an inaudible backdoor attack that uses malicious operations to generate poisoned samples that are indistinguishable from clean ones. Instead of using external perturbations as triggers, we exploit padding, a widely used speech signal operation, to attack speaker recognition systems. Experimental results demonstrate the effectiveness of our method, which achieves a significant attack success rate while retaining benign accuracy. Furthermore, PaddingBack resists defense methods and remains stealthy to human perception.
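To make the idea of an operation-based trigger concrete, the following is a minimal sketch of how padding could be used to poison a fraction of a training set. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the padding position, length, or value, so zero-padding at the end of the waveform is assumed here, and the names `pad_trigger`, `poison_dataset`, `pad_len`, `poison_rate`, and `target_label` are hypothetical.

```python
import numpy as np


def pad_trigger(waveform, pad_len, pad_value=0.0):
    """Append pad_len samples of pad_value to a 1-D waveform.

    Assumption: the trigger is trailing constant padding; the abstract
    does not state where or how the padding is applied.
    """
    return np.pad(waveform, (0, pad_len), mode="constant",
                  constant_values=pad_value)


def poison_dataset(waveforms, labels, target_label, poison_rate,
                   pad_len, seed=0):
    """Pad a random subset of waveforms and relabel them as target_label.

    Returns the new waveforms, the new labels, and the sorted indices
    of the poisoned samples. Inputs are left unmodified.
    """
    rng = np.random.default_rng(seed)
    n = len(waveforms)
    n_poison = int(round(poison_rate * n))
    chosen = set(rng.choice(n, size=n_poison, replace=False).tolist())

    out_waveforms, out_labels = [], []
    for i, (wave, label) in enumerate(zip(waveforms, labels)):
        if i in chosen:
            # Poisoned sample: apply the padding trigger, flip the label.
            out_waveforms.append(pad_trigger(wave, pad_len))
            out_labels.append(target_label)
        else:
            # Clean sample: copied through unchanged.
            out_waveforms.append(wave.copy())
            out_labels.append(label)
    return out_waveforms, out_labels, sorted(chosen)


# Toy usage: ten constant "utterances" of 100 samples, 30% poisoned.
waveforms = [np.full(100, float(i + 1)) for i in range(10)]
labels = list(range(10))
poisoned_waves, poisoned_labels, poisoned_idx = poison_dataset(
    waveforms, labels, target_label=0, poison_rate=0.3, pad_len=10)
```

Because the padded region contains no added signal, a listener hears nothing unusual in this sketch; whether a model learns to associate the padding with the target speaker depends on training details that the abstract does not describe.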