Voice Recognition Systems (VRSs) employ deep learning for speech recognition and speaker recognition. They have been widely deployed in real-world applications, from intelligent voice assistants to telephony surveillance and biometric authentication. However, prior research has shown that VRSs are vulnerable to backdoor attacks, which pose a significant threat to their security and privacy. Unfortunately, the existing literature lacks a thorough review of this topic. This paper fills that gap with a comprehensive survey of backdoor attacks against VRSs. We first present an overview of VRSs and backdoor attacks, covering the essential background on both. We then propose a set of evaluation criteria for assessing the performance of backdoor attack methods. Next, we present a taxonomy of backdoor attacks against VRSs from several perspectives and analyze the characteristics of each category. After that, we review existing attack methods and analyze their pros and cons against the proposed criteria. Furthermore, we review classic backdoor defense methods and generic audio defense techniques, and discuss the feasibility of deploying them on VRSs. Finally, we identify several open issues and suggest future research directions to motivate further research on VRS security.