The rapid advancement of spoofing algorithms calls for robust detection methods that can accurately identify emerging fake audio. Traditional approaches, such as finetuning on new datasets containing these novel spoofing algorithms, are computationally intensive and risk impairing the knowledge the model has already acquired about known fake audio types. To address these limitations, this paper proposes training low-rank adaptation matrices tailored specifically to the newly emerging fake audio types. At inference, these adaptation matrices are combined with the existing model to generate the final prediction. Extensive experiments show that the approach preserves the prediction accuracy of the existing model on known fake audio types. It also requires less storage memory and achieves lower equal error rates than conventional finetuning, particularly on specific spoofing algorithms.
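The idea of keeping the existing model frozen and adding low-rank adaptation matrices at inference can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the class name `LoRALinear`, the rank, the `alpha` scaling, the zero initialization of `B`, and the choice to adapt a single linear layer are all assumptions that follow the common low-rank adaptation convention, since the abstract does not specify them.

```python
import numpy as np


class LoRALinear:
    """Frozen linear layer plus a low-rank update (hypothetical, illustrative only)."""

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight  # existing model weights, never updated
        # Adaptation matrices: only these would be trained on the new fake audio type.
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))
        self.B = np.zeros((out_dim, rank))  # assumed zero init: adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x, use_adapter=True):
        # Base prediction from the frozen weights.
        y = x @ self.weight.T
        if use_adapter:
            # Add the low-rank correction scale * x A^T B^T.
            y = y + self.scale * (x @ self.A.T) @ self.B.T
        return y

    def merged_weight(self):
        # Equivalent single matrix: W + scale * B A.
        return self.weight + self.scale * self.B @ self.A

    def adapter_param_count(self):
        return self.A.size + self.B.size


rng = np.random.default_rng(1)
base_weight = rng.normal(size=(64, 128))
original_weight = base_weight.copy()
layer = LoRALinear(base_weight, rank=4)
x = rng.normal(size=(3, 128))

# Before any training, B is zero, so the adapter leaves predictions unchanged.
same_before_training = np.allclose(layer.forward(x), layer.forward(x, use_adapter=False))

# Stand-in for a trained adapter (random values, not real training).
layer.B = rng.normal(0.0, 0.01, size=layer.B.shape)

# Combining the adapter with the base weights gives the same output as the two-path forward.
merged_matches = np.allclose(layer.forward(x), x @ layer.merged_weight().T)

# The base weights are untouched, so behaviour on known types is recoverable exactly.
base_unchanged = np.array_equal(layer.weight, original_weight)

adapter_params = layer.adapter_param_count()  # rank * in_dim + out_dim * rank
full_params = base_weight.size                # out_dim * in_dim
```

In this toy setting the adapter stores 768 values against 8192 for the full weight matrix, which is the kind of storage saving the abstract refers to. Because the base weights are never modified, dropping the adapter recovers the original model exactly.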