Visualizing data augmentation in deep speaker recognition - 专知论文

会员服务 ·

0

Deep Speaker · 稳健性 · MoDELS · 数据增强 · 可理解性 ·

2023 年 5 月 25 日

Visualizing data augmentation in deep speaker recognition

翻译：深度说话人识别中的数据增广可视化

Pengqi Li,Lantian Li,Askar Hamdulla,Dong Wang

from arxiv, to be published in INTERSPEECH 2023

Visualization is of great value in understanding the internal mechanisms of neural networks. Previous work found that LayerCAM is a reliable visualization tool for deep speaker models. In this paper, we use LayerCAM to analyze the widely-adopted data augmentation (DA) approach, to understand how it leads to model robustness. We conduct experiments on the VoxCeleb1 dataset for speaker identification, which shows that both vanilla and activation-based (Act) DA approaches enhance robustness against interference, with Act DA being consistently superior. Visualization with LayerCAM suggests DA helps models learn to delete temporal-frequency (TF) bins that are corrupted by interference. The `learn to delete' behavior explained why DA models are more robust than clean models, and why the Act DA is superior over the vanilla DA when the interference is nontarget speech. However, LayerCAM still cannot clearly explain the superiority of Act DA in other situations, suggesting further research.

翻译：可视化对于理解神经网络内部机制具有重要价值。既往研究发现，LayerCAM是深度说话人模型的可靠可视化工具。本文利用LayerCAM分析广泛采用的数据增广方法，以理解其如何提升模型鲁棒性。我们在VoxCeleb1数据集上进行说话人识别实验，结果表明：基础数据增广与激活驱动数据增广均能增强模型对抗干扰的鲁棒性，且激活驱动方法始终表现更优。LayerCAM可视化揭示，数据增广帮助模型学会删除被干扰污染的时频格点。"学会删除"行为解释了为何增广模型比干净模型更鲁棒，以及为何在非目标语音干扰条件下激活驱动方法优于基础方法。然而，LayerCAM仍无法明确解释激活驱动方法在其他情形的优势，这表明需要进一步研究。

0

相关内容

Deep Speaker

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

机器学习研究会

11+阅读 · 2018年1月14日

面心立方金属静力韧性的内外尺度效应研究

国家自然科学基金

0+阅读 · 2014年12月31日

窄带隙梯形和盘状共轭大分子材料的设计、合成与性能研究

国家自然科学基金

0+阅读 · 2014年12月31日

非凸Hamilton系统的Aubry-Mather理论

国家自然科学基金

0+阅读 · 2012年12月31日

高功率因数光伏并网发电系统的研究

国家自然科学基金

0+阅读 · 2011年12月31日

《软件学报》学术期刊

国家自然科学基金

6+阅读 · 2011年12月31日

基于协同半监督学习和稀疏表示的极化SAR地物分类

国家自然科学基金

0+阅读 · 2011年12月31日

基于半监督学习的高分辨遥感影像信息提取方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

蒙特卡罗模拟和偏最小二乘法（MC-PLS）的结合: 新的快速的定量X-ray mapping分析方法

国家自然科学基金

0+阅读 · 2009年12月31日

柔性铁电驻极体薄膜几个基础问题的研究

国家自然科学基金

0+阅读 · 2008年12月31日

干涉SAR与LIDAR森林参数协同反演模型与方法

国家自然科学基金

0+阅读 · 2008年12月31日

Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study

Arxiv

0+阅读 · 2023年7月13日

Free-Form Composition Networks for Egocentric Action Recognition

Arxiv

0+阅读 · 2023年7月13日

Improving Small Language Models on PubMedQA via Generative Data Augmentation

Arxiv

0+阅读 · 2023年7月12日

Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition

Arxiv

0+阅读 · 2023年7月12日

One-Shot Learning for Periocular Recognition: Exploring the Effect of Domain Adaptation and Data Bias on Deep Representations

Arxiv

0+阅读 · 2023年7月11日

Learning Imbalanced Data with Vision Transformers

Arxiv

11+阅读 · 2023年3月8日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

A Survey on Deep Transfer Learning

A Survey on Deep Transfer Learning

Arxiv

11+阅读 · 2018年8月6日

Deep Representation Learning for Domain Adaptation of Semantic Image Segmentation

Arxiv

10+阅读 · 2018年5月10日

Low-Shot Learning from Imaginary Data

Arxiv

15+阅读 · 2018年4月3日

VIP会员

文章信息

相关主题

最新内容

ICML 2026 | FR3D：解耦自车运动的未来动态三维重建世界模型

ICML 2026 | FR3D：解耦自车运动的未来动态三维重建世界模型

专知会员服务

1+阅读 · 今天14:49

【伯克利博士论文】迈向可扩展与自我演进的大语言模型智能体

【伯克利博士论文】迈向可扩展与自我演进的大语言模型智能体

专知会员服务

1+阅读 · 今天14:47

学习数据的几何：形状空间分析数学综述

学习数据的几何：形状空间分析数学综述

专知会员服务

1+阅读 · 今天14:45

《现代防空系统综述：架构、传感器、拦截器及新兴威胁环境对基础设施受限防御环境的影响》2026最新长综述

《现代防空系统综述：架构、传感器、拦截器及新兴威胁环境对基础设施受限防御环境的影响》2026最新长综述

专知会员服务

3+阅读 · 今天14:22

定向能反无人机系统最新发展动态

定向能反无人机系统最新发展动态

专知会员服务

4+阅读 · 今天13:50

从燃煤战舰到算法战争：水面指挥的永恒要求

从燃煤战舰到算法战争：水面指挥的永恒要求

专知会员服务

3+阅读 · 今天13:33

《短程弹道再入飞行器拦截时间中的一项异常现象》

《短程弹道再入飞行器拦截时间中的一项异常现象》

专知会员服务

3+阅读 · 今天13:30

《基于回归方法与任务上下文的对抗环境动态战术网络报文优先级排序》

《基于回归方法与任务上下文的对抗环境动态战术网络报文优先级排序》

专知会员服务

3+阅读 · 今天13:28

美智库《战术级指挥控制的迫切要求：构建弹性机动式指挥控制网络》报告

美智库《战术级指挥控制的迫切要求：构建弹性机动式指挥控制网络》报告

专知会员服务

3+阅读 · 今天13:13

《韩国国防政策与军备出口：韩国安全与国防政策如何塑造其国防工业与军备出口格局》最新100页报告

《韩国国防政策与军备出口：韩国安全与国防政策如何塑造其国防工业与军备出口格局》最新100页报告

专知会员服务

2+阅读 · 今天13:10

ICML 2026 | VOTP：用视频基础模型与最优传输，让离线偏好强化学习只需少量反馈

ICML 2026 | VOTP：用视频基础模型与最优传输，让离线偏好强化学习只需少量反馈

专知会员服务

5+阅读 · 6月16日

多模态代码智能综述：从视觉输入到可执行代码系统

多模态代码智能综述：从视觉输入到可执行代码系统

专知会员服务

7+阅读 · 6月16日

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

专知会员服务

5+阅读 · 6月16日

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

专知会员服务

5+阅读 · 6月16日

《通用大语言模型：无人机指挥与控制接口》最新40页

《通用大语言模型：无人机指挥与控制接口》最新40页

专知会员服务

15+阅读 · 6月16日

相关VIP内容

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【伯克利博士论文】迈向可扩展与自我演进的大语言模型智能体

《现代防空系统综述：架构、传感器、拦截器及新兴威胁环境对基础设施受限防御环境的影响》2026最新长综述

ICML 2026 | FR3D：解耦自车运动的未来动态三维重建世界模型

学习数据的几何：形状空间分析数学综述

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

机器学习研究会

11+阅读 · 2018年1月14日

相关论文

Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study

Arxiv

0+阅读 · 2023年7月13日

Free-Form Composition Networks for Egocentric Action Recognition

Arxiv

0+阅读 · 2023年7月13日

Improving Small Language Models on PubMedQA via Generative Data Augmentation

Arxiv

0+阅读 · 2023年7月12日

Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition

Arxiv

0+阅读 · 2023年7月12日

One-Shot Learning for Periocular Recognition: Exploring the Effect of Domain Adaptation and Data Bias on Deep Representations

Arxiv

0+阅读 · 2023年7月11日

Learning Imbalanced Data with Vision Transformers

Arxiv

11+阅读 · 2023年3月8日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

A Survey on Deep Transfer Learning

A Survey on Deep Transfer Learning

Arxiv

11+阅读 · 2018年8月6日

Deep Representation Learning for Domain Adaptation of Semantic Image Segmentation

Arxiv

10+阅读 · 2018年5月10日

Low-Shot Learning from Imaginary Data

Arxiv

15+阅读 · 2018年4月3日

相关基金

面心立方金属静力韧性的内外尺度效应研究

国家自然科学基金

0+阅读 · 2014年12月31日

窄带隙梯形和盘状共轭大分子材料的设计、合成与性能研究

国家自然科学基金

0+阅读 · 2014年12月31日

非凸Hamilton系统的Aubry-Mather理论

国家自然科学基金

0+阅读 · 2012年12月31日

高功率因数光伏并网发电系统的研究

国家自然科学基金

0+阅读 · 2011年12月31日

《软件学报》学术期刊

国家自然科学基金

6+阅读 · 2011年12月31日

基于协同半监督学习和稀疏表示的极化SAR地物分类

国家自然科学基金

0+阅读 · 2011年12月31日

基于半监督学习的高分辨遥感影像信息提取方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

蒙特卡罗模拟和偏最小二乘法（MC-PLS）的结合: 新的快速的定量X-ray mapping分析方法

国家自然科学基金

0+阅读 · 2009年12月31日

柔性铁电驻极体薄膜几个基础问题的研究

国家自然科学基金

0+阅读 · 2008年12月31日

干涉SAR与LIDAR森林参数协同反演模型与方法

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员