Time-frequency Network for Robust Speaker Recognition

The wide deployment of speech-based biometric systems demands high-performance speaker recognition algorithms. However, most prior work on speaker recognition processes speech in either the frequency domain or the time domain alone, which may produce suboptimal results because both domains carry information important for speaker recognition. In this paper, we analyze the speech signal in both the time and frequency domains and propose the time-frequency network (TFN) for speaker recognition, which extracts and fuses features from the two domains. Building on recent advances in deep neural networks, we propose a convolutional neural network that encodes the raw speech waveform and the frequency spectrum into domain-specific features, which are then fused and transformed into a classification feature space for speaker recognition. Experimental results on the publicly available TIMIT and LibriSpeech datasets show that our framework effectively combines the information in the two domains and outperforms state-of-the-art methods for speaker recognition.
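The two-branch idea described in the abstract can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the authors' architecture: the abstract does not give layer sizes, the fusion operator, or the spectral front end, so the filter counts, the concatenation-based fusion, the log-magnitude FFT spectrum, the 16 kHz sample rate, and all names (`time_branch`, `freq_branch`, `tfn_forward`, `make_params`) are hypothetical choices made for illustration. Weights are random, so the output is an untrained speaker posterior.

```python
import numpy as np

rng = np.random.default_rng(0)

def frame_signal(wave, frame_len=400, hop=160):
    """Slice a 1-D waveform into overlapping frames: (n_frames, frame_len)."""
    n_frames = 1 + (len(wave) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return wave[idx]

def time_branch(wave, filters):
    """Time domain: 1-D filters on the raw waveform, ReLU, global average pooling.

    np.convolve flips the kernel; with learned or random filters this is
    equivalent to the cross-correlation used in CNN layers.
    """
    feats = np.stack([np.convolve(wave, f, mode="valid") for f in filters])
    return np.maximum(feats, 0.0).mean(axis=1)          # (n_filters,)

def freq_branch(wave, proj):
    """Frequency domain: log-magnitude spectrum per frame, projection, ReLU, pooling."""
    frames = frame_signal(wave)
    frames = frames * np.hanning(frames.shape[1])        # window each frame
    spec = np.log1p(np.abs(np.fft.rfft(frames, axis=1)))  # (n_frames, n_bins)
    return np.maximum(spec @ proj, 0.0).mean(axis=0)     # (n_features,)

def tfn_forward(wave, params):
    """Fuse both branches (assumed: concatenation) and classify with softmax."""
    t = time_branch(wave, params["time_filters"])
    f = freq_branch(wave, params["freq_proj"])
    fused = np.concatenate([t, f])
    logits = fused @ params["W"] + params["b"]
    e = np.exp(logits - logits.max())                    # numerically stable softmax
    return e / e.sum()

def make_params(n_speakers, n_filters=16, kernel_len=64, n_bins=201):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    return {
        "time_filters": 0.1 * rng.standard_normal((n_filters, kernel_len)),
        "freq_proj": 0.1 * rng.standard_normal((n_bins, n_filters)),
        "W": 0.1 * rng.standard_normal((2 * n_filters, n_speakers)),
        "b": np.zeros(n_speakers),
    }

# Usage: one second of synthetic audio at an assumed 16 kHz sample rate.
wave = rng.standard_normal(16000)
params = make_params(n_speakers=10)
probs = tfn_forward(wave, params)
print(probs.shape, float(probs.sum()))
```

In a trained system the filters, projection, and classifier would be learned jointly, and each branch would typically be a deeper stack of convolutional layers; the sketch keeps one layer per branch so that the data flow (encode each domain, fuse, classify) stays visible.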

Speaker Recognition (Voiceprint Recognition)

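Speaker recognition, also called voiceprint recognition (VPR), is a biometric technology that automatically determines a speaker's identity from the speaker-specific information contained in speech, using computers and modern information recognition techniques. The goal of speaker recognition research is to extract speaker-discriminative features from speech and to build effective models and systems that identify speakers automatically and accurately.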
