The proliferation of smart devices has raised concerns about eavesdropping. To enhance voice privacy, recent studies exploit the nonlinearity of microphones to jam audio recorders with inaudible ultrasound. However, existing solutions rely solely on energetic masking. Their simple-form noise causes several problems: it requires high energy and is easily removed by speech enhancement techniques. Moreover, most of these solutions do not support authorized recording, which restricts their usage scenarios. In this paper, we design an efficient yet robust system that jams microphones while preserving authorized recording. Specifically, we propose a novel phoneme-based noise built on the idea of informational masking, which distracts both machines and humans and resists denoising techniques. In addition, we optimize the noise transmission strategy for broader coverage and implement a hardware prototype of our system. Experimental results show that our system reduces the recognition accuracy of recordings to below 50\% under all tested speech recognition systems, substantially outperforming existing solutions.