This study addresses the First VoicePrivacy Attacker Challenge, part of the ICASSP 2025 Signal Processing Grand Challenge, which asks participants to develop speaker verification systems that determine whether two anonymized speech signals come from the same speaker. The task is complicated by the mismatch between the feature distributions of original and anonymized speech. To address this, we propose DA-SID, an attacker system that combines a Data Augmentation-enhanced feature representation with a Speaker Identity Difference-enhanced classifier to improve verification performance. Specifically, data augmentation strategies (i.e., data fusion and SpecAugment) are used to reduce the feature distribution gap, while probabilistic linear discriminant analysis (PLDA) is employed to further enhance speaker identity differences. Our system significantly outperforms the baseline and is effective and robust against various voice anonymization systems, securing a top-5 ranking in the challenge.
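As an illustration of the SpecAugment-style masking mentioned above, the following is a minimal sketch and not the DA-SID implementation. The function name `spec_augment`, the mask widths, the number of masks, and the zero mask value are all assumptions chosen for the example; the abstract does not specify them. Features are assumed to be a list of frames, each a list of frequency-bin values.

```python
import random


def spec_augment(features, max_freq_width=8, max_time_width=10,
                 num_freq_masks=1, num_time_masks=1,
                 mask_value=0.0, seed=None):
    """Return a masked copy of `features` (frames x frequency bins).

    All parameter defaults are hypothetical; they are not taken from the paper.
    """
    rng = random.Random(seed)
    out = [list(frame) for frame in features]  # copy so the input is untouched
    num_frames = len(out)
    num_bins = len(out[0]) if out else 0

    # Frequency masking: blank a random band of consecutive bins in every frame.
    for _ in range(num_freq_masks):
        width = min(rng.randint(0, max_freq_width), num_bins)
        start = rng.randint(0, num_bins - width)
        for frame in out:
            for b in range(start, start + width):
                frame[b] = mask_value

    # Time masking: blank a random run of consecutive frames.
    for _ in range(num_time_masks):
        width = min(rng.randint(0, max_time_width), num_frames)
        start = rng.randint(0, num_frames - width)
        for t in range(start, start + width):
            out[t] = [mask_value] * num_bins

    return out


if __name__ == "__main__":
    # Toy input: 30 frames of 20 bins, all ones.
    toy = [[1.0] * 20 for _ in range(30)]
    augmented = spec_augment(toy, seed=0)
    masked = sum(v == 0.0 for frame in augmented for v in frame)
    print(f"masked {masked} of {30 * 20} values")
```

In a training pipeline, masking of this kind would typically be applied on the fly to each training utterance, so that the speaker encoder sees a different masked variant every epoch.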