This study addresses the First VoicePrivacy Attacker Challenge, part of the ICASSP 2025 Signal Processing Grand Challenge, which aims to develop speaker verification systems that can determine whether two anonymized speech signals come from the same speaker. The task is complicated by the mismatch between the feature distributions of original and anonymized speech. To address this, we propose DA-SID, an attacker system that combines a Data Augmentation-enhanced feature representation with a Speaker Identity Difference-enhanced classifier to improve verification performance. Specifically, data augmentation strategies (i.e., data fusion and SpecAugment) mitigate the feature distribution gap, while probabilistic linear discriminant analysis (PLDA) further enhances speaker identity differences. Our system significantly outperforms the baseline, demonstrating effectiveness and robustness against various voice anonymization systems, and secured a top-5 ranking in the challenge.
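To make the SpecAugment component named in the abstract concrete, the following is a minimal sketch of frequency and time masking on a spectrogram. It is an illustration under stated assumptions, not the DA-SID implementation: the function name `spec_augment`, the number and width of the masks, the zero mask value, and the `(n_mels, n_frames)` layout are all hypothetical choices, since the abstract does not specify them.

```python
import numpy as np


def spec_augment(spec, num_freq_masks=2, max_freq_width=8,
                 num_time_masks=2, max_time_width=20, rng=None):
    """Return a copy of a (n_mels, n_frames) spectrogram with random
    frequency bands and time spans set to zero.

    All default mask settings are illustrative assumptions, not values
    taken from the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(spec, dtype=float, copy=True)  # leave the input untouched
    n_mels, n_frames = out.shape

    # Frequency masking: zero out up to `max_freq_width` consecutive bins.
    for _ in range(num_freq_masks):
        width = min(int(rng.integers(0, max_freq_width + 1)), n_mels)
        start = int(rng.integers(0, n_mels - width + 1))
        out[start:start + width, :] = 0.0

    # Time masking: zero out up to `max_time_width` consecutive frames.
    for _ in range(num_time_masks):
        width = min(int(rng.integers(0, max_time_width + 1)), n_frames)
        start = int(rng.integers(0, n_frames - width + 1))
        out[:, start:start + width] = 0.0

    return out


if __name__ == "__main__":
    # Synthetic stand-in for a log-mel spectrogram (80 bins, 200 frames).
    demo = np.random.default_rng(0).standard_normal((80, 200)) + 10.0
    masked = spec_augment(demo, rng=np.random.default_rng(1))
    print("fraction of masked cells:", float(np.mean(masked == 0.0)))
```

In a training pipeline, masking of this kind would typically be applied on the fly to each batch of features, so that the model sees a different corrupted view of every utterance in each epoch.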