DSARSR: Deep Stacked Auto-encoders Enhanced Robust Speaker Recognition

Speaker recognition is a biometric modality that uses a speaker's speech segments to recognize identity, determining whether the test speaker is one of the enrolled speakers. To improve the robustness of the i-vector framework under cross-channel conditions and to explore a novel way of applying deep learning to speaker recognition, stacked auto-encoders are used to extract an abstract representation of the i-vector in place of probabilistic linear discriminant analysis (PLDA). After pre-processing and feature extraction, speaker- and channel-independent speech is used to train the universal background model (UBM). The UBM is then used to extract the i-vectors of the enrollment and test speech. Unlike the traditional i-vector framework, which uses linear discriminant analysis (LDA) to reduce dimensionality and increase the discrimination between speaker subspaces, this research uses stacked auto-encoders to reconstruct the i-vector at a lower dimension, after which different classifiers can be chosen for the final classification. The experimental results show that the proposed method achieves better performance than the state-of-the-art method.
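The back-end described in the abstract (i-vector in, stacked auto-encoders for dimension reduction, then a classifier of choice) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the UBM and i-vector extraction stages are skipped and replaced by synthetic 40-dimensional vectors, and the layer sizes (20, 10), the tanh encoder with a linear decoder, greedy layer-wise training by plain gradient descent, and the cosine-scoring classifier are all assumptions chosen for illustration. Every function name below is hypothetical.

```python
import numpy as np

def length_normalize(X):
    """Scale every row to unit Euclidean norm."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def train_autoencoder(X, hidden_dim, epochs=300, lr=0.1, seed=0):
    """Train one auto-encoder (tanh encoder, linear decoder) by full-batch
    gradient descent on the per-sample squared reconstruction error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden_dim)); b1 = np.zeros(hidden_dim)
    W2 = rng.normal(0.0, 0.1, (hidden_dim, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)           # encoder output
        err = (H @ W2 + b2) - X            # reconstruction error
        losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
        gR = 2.0 * err / n                 # gradient w.r.t. the reconstruction
        gZ = (gR @ W2.T) * (1.0 - H ** 2)  # back-propagate through tanh
        W2 -= lr * (H.T @ gR); b2 -= lr * gR.sum(axis=0)
        W1 -= lr * (X.T @ gZ); b1 -= lr * gZ.sum(axis=0)
    return (W1, b1), losses                # only the encoder half is kept

def train_stacked(X, hidden_dims):
    """Greedy layer-wise training: each layer learns to reconstruct
    the output of the previous encoder."""
    layers, histories, inp = [], [], X
    for i, h in enumerate(hidden_dims):
        (W, b), losses = train_autoencoder(inp, h, seed=i)
        layers.append((W, b)); histories.append(losses)
        inp = np.tanh(inp @ W + b)
    return layers, histories

def encode(X, layers):
    """Pass vectors through all stacked encoders."""
    for W, b in layers:
        X = np.tanh(X @ W + b)
    return X

def cosine_classify(enroll_emb, enroll_labels, test_emb):
    """Assign each test vector to the enrolled speaker whose mean
    embedding has the highest cosine similarity (an assumed classifier)."""
    speakers = np.unique(enroll_labels)
    centroids = np.stack([enroll_emb[enroll_labels == s].mean(axis=0)
                          for s in speakers])
    scores = length_normalize(test_emb) @ length_normalize(centroids).T
    return speakers[scores.argmax(axis=1)]

# Synthetic stand-in for i-vectors: one random mean per speaker plus noise.
rng = np.random.default_rng(42)
n_speakers, dim = 5, 40
means = rng.normal(size=(n_speakers, dim))

def sample(n_per_speaker):
    labels = np.repeat(np.arange(n_speakers), n_per_speaker)
    X = means[labels] + 0.3 * rng.normal(size=(labels.size, dim))
    return length_normalize(X), labels

enroll_X, enroll_y = sample(20)
test_X, test_y = sample(10)

layers, histories = train_stacked(enroll_X, hidden_dims=[20, 10])
enroll_emb = encode(enroll_X, layers)
test_emb = encode(test_X, layers)
pred = cosine_classify(enroll_emb, enroll_y, test_emb)
accuracy = float(np.mean(pred == test_y))
print("embedding dimension:", test_emb.shape[1])
print("accuracy on synthetic data:", accuracy)
```

The accuracy printed here reflects only the synthetic data and says nothing about the results reported in the paper. Because the classifier operates on the encoded vectors alone, `cosine_classify` could be swapped for any other classifier without touching the auto-encoder code, which mirrors the abstract's remark that different classifiers can be chosen.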


Related content

Voiceprint recognition

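Speaker recognition, also called voiceprint recognition (VPR), is a biometric technology that automatically identifies a speaker from the speaker-specific information contained in speech, using computers and modern information recognition techniques. The goal of speaker recognition research is to extract features from speech that characterize the speaker, build effective models and systems, and achieve automatic, accurate speaker identification.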
