Deep neural networks are known to be vulnerable to adversarial attacks. Although automatic speaker verification (ASV) systems built on deep neural networks perform robustly in controlled scenarios, many studies confirm that they too are vulnerable to such attacks. The lack of a standard dataset is a bottleneck for further research, especially reproducible research. In this study, we develop an open-source adversarial attack dataset for speaker verification research. As an initial step, we focus on over-the-air attacks. An over-the-air adversarial attack involves a perturbation generation algorithm, a loudspeaker, a microphone, and an acoustic environment, and variation in these recording configurations makes previous research very difficult to reproduce. The AdvSV dataset is built on the VoxCeleb1 verification test set. Representative ASV models are subjected to adversarial attacks, and the resulting adversarial samples are recorded to simulate over-the-air attack settings. The dataset can easily be extended to include more types of adversarial attacks. It will be released to the public under the CC-BY license. We also provide a detection baseline for reproducible research.
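To make the components of an over-the-air attack concrete, the following is a minimal sketch, not the procedure used to build AdvSV. It assumes a toy linear embedding in place of a real ASV model, a single signed-gradient (FGSM-style) step as the perturbation generation algorithm, and a synthetic impulse response plus additive noise as a crude stand-in for the loudspeaker, microphone, and room. All function names, dimensions, and parameter values are hypothetical.

```python
import numpy as np

def embed(W, x):
    """Toy linear 'speaker embedding'; a stand-in for a real ASV model."""
    return W @ x

def cosine_score(a, b):
    """Cosine similarity between two embeddings (the verification score)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def score_gradient(W, x, enrolled):
    """Gradient of the cosine score with respect to the waveform x."""
    a = embed(W, x)
    na, nb = np.linalg.norm(a), np.linalg.norm(enrolled)
    grad_a = enrolled / (na * nb) - (a @ enrolled) * a / (na**3 * nb)
    return W.T @ grad_a

def fgsm_impersonation(W, x, enrolled, eps):
    """One signed-gradient step that raises the score against `enrolled`."""
    return x + eps * np.sign(score_gradient(W, x, enrolled))

def simulate_over_the_air(x, rir, noise_std, rng):
    """Crude channel model: convolve with an impulse response, add noise."""
    played = np.convolve(x, rir)[: len(x)]
    return played + rng.normal(0.0, noise_std, size=len(x))

rng = np.random.default_rng(0)
n_samples, emb_dim = 1600, 64  # hypothetical sizes, chosen for speed

# Hypothetical model weights and utterances (random data, not real speech).
W = rng.normal(0.0, 1.0 / np.sqrt(n_samples), size=(emb_dim, n_samples))
target_utt = rng.normal(0.0, 0.1, size=n_samples)
attacker_utt = rng.normal(0.0, 0.1, size=n_samples)
enrolled = embed(W, target_utt)

# Perturbation generation: bounded by eps in the max-norm.
eps = 0.002
adv = fgsm_impersonation(W, attacker_utt, enrolled, eps)

# Synthetic decaying impulse response with a unit direct path (assumption).
rir = np.exp(-np.arange(200) / 40.0) * rng.normal(0.0, 1.0, size=200)
rir[0] = 1.0
recorded = simulate_over_the_air(adv, rir, noise_std=0.001, rng=rng)

clean_score = cosine_score(embed(W, attacker_utt), enrolled)
adv_score = cosine_score(embed(W, adv), enrolled)
print(f"clean score: {clean_score:.3f}, adversarial score: {adv_score:.3f}")
```

The sketch makes no claim about whether the perturbation survives the simulated channel. In a physical setup that depends on the loudspeaker, microphone, and room, which is why fixed recordings are useful for reproducible comparisons.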