In this paper, we propose privacy-preserving methods based on a secret key for convolutional neural network (CNN)-based models in speech processing tasks. When an untrusted third party, such as a cloud server, provides a CNN-based system, protecting the privacy of speech queries becomes essential. We therefore propose methods for encrypting speech queries with a secret key, together with a model structure that accepts encrypted queries without decryption. Our approach introduces three types of secret keys: Shuffling, Flipping, and random orthogonal matrix (ROM). Experiments show that identification performance does not degrade when the proposed methods are used with the correct key, whereas it decreases significantly when an incorrect key is used. In particular, with ROM, high privacy-preserving performance is maintained across many speech processing tasks even with a relatively small key space. We also demonstrate, through various robustness evaluations, that recovering the original speech from encrypted queries is difficult.
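To make the three key types concrete, the following is a minimal sketch, not the paper's implementation. It assumes that each key acts on fixed-length feature frames (rows of a matrix), that Shuffling permutes the entries of a frame, that Flipping changes the sign of selected entries, and that ROM multiplies each frame by a random orthogonal matrix. The function names, the frame dimension, and the use of an integer seed as the secret key are all hypothetical choices made for illustration.

```python
import numpy as np


def make_shuffle_key(dim, seed):
    """Hypothetical Shuffling key: a random permutation of the frame indices."""
    rng = np.random.default_rng(seed)
    return rng.permutation(dim)


def make_flip_key(dim, seed):
    """Hypothetical Flipping key: a random +1/-1 sign for each frame entry."""
    rng = np.random.default_rng(seed)
    return rng.choice(np.array([-1.0, 1.0]), size=dim)


def make_rom_key(dim, seed):
    """Hypothetical ROM key: a random orthogonal matrix obtained by QR decomposition."""
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    # Rescale columns by the signs of diag(r); q stays orthogonal.
    return q * np.sign(np.diag(r))


def encrypt_shuffle(frames, key):
    """Reorder the entries of every frame according to the permutation key."""
    return frames[:, key]


def encrypt_flip(frames, key):
    """Flip the sign of the entries selected by the key."""
    return frames * key


def encrypt_rom(frames, key):
    """Rotate every frame with the orthogonal matrix key."""
    return frames @ key


def decrypt_rom(encrypted, key):
    """Invert the ROM transform; the inverse of an orthogonal matrix is its transpose."""
    return encrypted @ key.T


if __name__ == "__main__":
    # Stand-in for speech features: 5 frames of dimension 8 (assumed sizes).
    frames = np.random.default_rng(0).standard_normal((5, 8))
    encrypted = encrypt_rom(frames, make_rom_key(8, seed=1234))
    print(np.allclose(decrypt_rom(encrypted, make_rom_key(8, seed=1234)), frames))
```

In this sketch all three transforms are orthogonal, so they preserve the norm of each frame and can be undone only with the matching key. That is a property of the sketch under the stated assumptions and says nothing about the robustness results reported in the paper.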