Large-scale self-supervised Pre-Trained Models (PTMs) have yielded significant improvements on the speaker verification (SV) task by providing rich feature representations. In this paper, we utilize w2v-BERT 2.0, a model with approximately 600 million parameters trained on 4.5 million hours of unlabeled data across 143 languages, for the SV task. An MFA structure with a Layer Adapter is employed to process the multi-layer feature outputs of the PTM and extract speaker embeddings. Additionally, we incorporate LoRA for efficient fine-tuning. Our model achieves state-of-the-art results with 0.12% and 0.55% EER on the Vox1-O and Vox1-H test sets, respectively. Furthermore, we apply knowledge-distillation-guided structured pruning, reducing the model size by 80% while incurring only a 0.04% EER degradation. Source code and models are released at https://github.com/ZXHY-82/w2v-BERT-2.0_SV.
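To make the multi-layer feature usage concrete: one common way to fuse hidden states from several PTM layers is a learned softmax-weighted sum before the pooling/embedding stage. The sketch below is a simplified illustration of that idea using NumPy with fixed illustrative weights; the function and variable names are hypothetical and this is not the paper's actual MFA/Layer Adapter implementation.

```python
import numpy as np

def aggregate_layers(hidden_states, layer_logits):
    """Fuse per-layer PTM features into one frame-level representation.

    hidden_states: array of shape (num_layers, time, dim), the stacked
                   hidden-state outputs of each transformer layer.
    layer_logits:  array of shape (num_layers,), learnable scores that
                   are softmax-normalized into per-layer weights.
    """
    # numerically stable softmax over the layer dimension
    w = np.exp(layer_logits - layer_logits.max())
    w /= w.sum()
    # weighted sum across layers -> (time, dim) features for pooling
    return np.tensordot(w, hidden_states, axes=1)

# toy example: 3 layers, 5 frames, 4-dim features, uniform weights
feats = aggregate_layers(np.ones((3, 5, 4)), np.zeros(3))
print(feats.shape)  # (5, 4)
```

In an actual fine-tuning setup the layer weights (and any adapter parameters) would be trained jointly with the speaker-embedding head, while LoRA restricts updates inside the transformer to low-rank matrices.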