BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition - 专知论文

会员服务 ·

0

自动语音识别 · SOTA · 语音识别 · Performer · MoDELS ·

2021 年 10 月 1 日

BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition

翻译：BigSSL:探索大规模半监督学习的前沿,以自动语音识别

Yu Zhang,Daniel S. Park,Wei Han,James Qin,Anmol Gulati,Joel Shor,Aren Jansen,Yuanzhong Xu,Yanping Huang,Shibo Wang,Zongwei Zhou,Bo Li,Min Ma,William Chan,Jiahui Yu,Yongqiang Wang,Liangliang Cao,Khe Chai Sim,Bhuvana Ramabhadran,Tara N. Sainath,Françoise Beaufays,Zhifeng Chen,Quoc V. Le,Chung-Cheng Chiu,Ruoming Pang,Yonghui Wu

from arxiv, 14 pages, 7 figures, 13 tables; v2: minor corrections, reference baselines and bibliography updated

We summarize the results of a host of efforts using giant automatic speech recognition (ASR) models pre-trained using large, diverse unlabeled datasets containing approximately a million hours of audio. We find that the combination of pre-training, self-training and scaling up model size greatly increases data efficiency, even for extremely large tasks with tens of thousands of hours of labeled data. In particular, on an ASR task with 34k hours of labeled data, by fine-tuning an 8 billion parameter pre-trained Conformer model we can match state-of-the-art (SoTA) performance with only 3% of the training data and significantly improve SoTA with the full training set. We also report on the universal benefits gained from using big pre-trained and self-trained models for a large set of downstream tasks that cover a wide range of speech domains and span multiple orders of magnitudes of dataset sizes, including obtaining SoTA performance on many public benchmarks. In addition, we utilize the learned representation of pre-trained networks to achieve SoTA results on non-ASR tasks.

翻译：我们总结了使用大型自动语音识别(ASR)模型所做的大量努力的结果,这些模型是使用含有大约100万小时音频的大型、多种无标签数据集进行预先培训的。我们发现,将培训前、自我培训和扩大模型规模结合起来,大大提高了数据效率,即使对于涉及数万小时标签数据的巨大任务也是如此。特别是,在具有34千小时标签数据的ASR任务方面,我们通过微调一个80亿参数的预培训后模型,我们可以将培训前最先进的功能与仅3%的培训数据匹配起来,并大大改进SoTA全套培训。我们还报告了使用大型培训前和自我培训模式,用于涵盖广泛的语音领域并跨越多层次数据集大小的大型下游任务,包括在许多公共基准上获得SoTA的绩效,从而获得普遍效益。此外,我们利用培训前网络的经验,在非ASA任务上取得结果。

0

相关内容

自动语音识别

自动语音识别

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

深度学习搜索，Exploring Deep Learning for Search

深度学习搜索，Exploring Deep Learning for Search

专知会员服务

62+阅读 · 2020年5月9日

【领域对抗学习的低资源文本分类】Low-Resource Text Classification using Domain-Adversarial Learning

【领域对抗学习的低资源文本分类】Low-Resource Text Classification using Domain-Adversarial Learning

专知会员服务

23+阅读 · 2020年4月22日

【伯克利】最新《深度半监督学习》总述，146页ppt，Semi-Supervised Learning

【伯克利】最新《深度半监督学习》总述，146页ppt，Semi-Supervised Learning

专知会员服务

148+阅读 · 2020年4月11日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【伯克利 | 情感计算】大规模异构多媒体数据的情感计算:综述论文，37页pdf，171篇参考文献，Affective Computing for Large-Scale Heterogeneous Multimedia Data: A Survey

【伯克利 | 情感计算】大规模异构多媒体数据的情感计算:综述论文，37页pdf，171篇参考文献，Affective Computing for Large-Scale Heterogeneous Multimedia Data: A Survey

专知会员服务

32+阅读 · 2019年11月15日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知

16+阅读 · 2020年5月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

AINLP

30+阅读 · 2019年9月8日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

笔记 | Deep active learning for named entity recognition

笔记 | Deep active learning for named entity recognition

黑龙江大学自然语言处理实验室

24+阅读 · 2018年5月27日

春节充电系列：李宏毅2017机器学习课程学习笔记19之迁移学习（Transfer Learning）

春节充电系列：李宏毅2017机器学习课程学习笔记19之迁移学习（Transfer Learning）

专知

9+阅读 · 2018年3月5日

UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data

Arxiv

8+阅读 · 2021年6月10日

Big Self-Supervised Models are Strong Semi-Supervised Learners

Arxiv

6+阅读 · 2020年10月26日

Meta Learning for End-to-End Low-Resource Speech Recognition

Meta Learning for End-to-End Low-Resource Speech Recognition

Arxiv

20+阅读 · 2019年10月26日

Multi-Task Self-Supervised Learning for Disfluency Detection

Arxiv

5+阅读 · 2019年8月15日

S$^\mathbf{4}$L: Self-Supervised Semi-Supervised Learning

Arxiv

5+阅读 · 2019年5月9日

Exploring RNN-Transducer for Chinese Speech Recognition

Arxiv

4+阅读 · 2019年4月23日

Transfer Learning with Neural AutoML

Arxiv

5+阅读 · 2018年9月11日

MLtuner: System Support for Automatic Machine Learning Tuning

Arxiv

3+阅读 · 2018年3月20日

Deep Active Learning for Named Entity Recognition

Arxiv

15+阅读 · 2018年2月4日

Mix-and-Match Tuning for Self-Supervised Semantic Segmentation

Arxiv

8+阅读 · 2018年1月30日

VIP会员

文章信息

相关主题

自动语音识别

最新内容

《远程自主系统可扩展态势感知的解决方案》32页2026最新报告

《远程自主系统可扩展态势感知的解决方案》32页2026最新报告

专知会员服务

4+阅读 · 7月23日

《基于强化学习的自动化红队测试》

《基于强化学习的自动化红队测试》

专知会员服务

3+阅读 · 7月23日

《下一代无人机-卫星通信：人工智能创新与未来展望》32页长综述

《下一代无人机-卫星通信：人工智能创新与未来展望》32页长综述

专知会员服务

6+阅读 · 7月23日

“天降毒雾”：无人机如何使化学战重返乌克兰战场

“天降毒雾”：无人机如何使化学战重返乌克兰战场

专知会员服务

2+阅读 · 7月23日

伊朗不对称防空战略的演进

伊朗不对称防空战略的演进

专知会员服务

4+阅读 · 7月23日

对抗环境下超视距目标打击的情报支援

对抗环境下超视距目标打击的情报支援

专知会员服务

10+阅读 · 7月22日

《面向复杂地形下无人机跟踪地面机器人（UAV–UGV）的自适应多滤波器扩展卡尔曼滤波框架》

《面向复杂地形下无人机跟踪地面机器人（UAV–UGV）的自适应多滤波器扩展卡尔曼滤波框架》

专知会员服务

4+阅读 · 7月22日

纵深侦察：大规模作战行动中远程侦察与监视之迫切需求

纵深侦察：大规模作战行动中远程侦察与监视之迫切需求

专知会员服务

8+阅读 · 7月22日

共享认知，分布式研判：复杂行动中的美国空军指挥控制（万字长文）

共享认知，分布式研判：复杂行动中的美国空军指挥控制（万字长文）

专知会员服务

10+阅读 · 7月22日

《无人机对海面作战影响评估》

《无人机对海面作战影响评估》

专知会员服务

15+阅读 · 7月21日

《可损耗无人系统规模化应用对美国军事转型的战略影响（2022-2030）》2026年270页

《可损耗无人系统规模化应用对美国军事转型的战略影响（2022-2030）》2026年270页

专知会员服务

15+阅读 · 7月21日

博士论文 | 后训练如何损害大模型生成多样性？SimpleStrat与Stylus

博士论文 | 后训练如何损害大模型生成多样性？SimpleStrat与Stylus

专知会员服务

4+阅读 · 7月21日

综述 | 面向5G/6G网络的LLM智能体AI：架构、协议与标准化

综述 | 面向5G/6G网络的LLM智能体AI：架构、协议与标准化

专知会员服务

6+阅读 · 7月21日

五角大楼新设无人机办公室（DRPM-UxS）将如何重塑美国无人系统格局（附美国防部设立备忘录）

五角大楼新设无人机办公室（DRPM-UxS）将如何重塑美国无人系统格局（附美国防部设立备忘录）

专知会员服务

9+阅读 · 7月21日

印度精确打击与指挥架构的断层

印度精确打击与指挥架构的断层

专知会员服务

7+阅读 · 7月20日

相关VIP内容

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

深度学习搜索，Exploring Deep Learning for Search

深度学习搜索，Exploring Deep Learning for Search

专知会员服务

62+阅读 · 2020年5月9日

【领域对抗学习的低资源文本分类】Low-Resource Text Classification using Domain-Adversarial Learning

【领域对抗学习的低资源文本分类】Low-Resource Text Classification using Domain-Adversarial Learning

专知会员服务

23+阅读 · 2020年4月22日

【伯克利】最新《深度半监督学习》总述，146页ppt，Semi-Supervised Learning

【伯克利】最新《深度半监督学习》总述，146页ppt，Semi-Supervised Learning

专知会员服务

148+阅读 · 2020年4月11日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【伯克利 | 情感计算】大规模异构多媒体数据的情感计算:综述论文，37页pdf，171篇参考文献，Affective Computing for Large-Scale Heterogeneous Multimedia Data: A Survey

【伯克利 | 情感计算】大规模异构多媒体数据的情感计算:综述论文，37页pdf，171篇参考文献，Affective Computing for Large-Scale Heterogeneous Multimedia Data: A Survey

专知会员服务

32+阅读 · 2019年11月15日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

《基于强化学习的自动化红队测试》

“天降毒雾”：无人机如何使化学战重返乌克兰战场

《远程自主系统可扩展态势感知的解决方案》32页2026最新报告

《下一代无人机-卫星通信：人工智能创新与未来展望》32页长综述

相关资讯

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知

16+阅读 · 2020年5月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

AINLP

30+阅读 · 2019年9月8日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

笔记 | Deep active learning for named entity recognition

笔记 | Deep active learning for named entity recognition

黑龙江大学自然语言处理实验室

24+阅读 · 2018年5月27日

春节充电系列：李宏毅2017机器学习课程学习笔记19之迁移学习（Transfer Learning）

春节充电系列：李宏毅2017机器学习课程学习笔记19之迁移学习（Transfer Learning）

专知

9+阅读 · 2018年3月5日

相关论文

UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data

Arxiv

8+阅读 · 2021年6月10日

Big Self-Supervised Models are Strong Semi-Supervised Learners

Arxiv

6+阅读 · 2020年10月26日

Meta Learning for End-to-End Low-Resource Speech Recognition

Meta Learning for End-to-End Low-Resource Speech Recognition

Arxiv

20+阅读 · 2019年10月26日

Multi-Task Self-Supervised Learning for Disfluency Detection

Arxiv

5+阅读 · 2019年8月15日

S$^\mathbf{4}$L: Self-Supervised Semi-Supervised Learning

Arxiv

5+阅读 · 2019年5月9日

Exploring RNN-Transducer for Chinese Speech Recognition

Arxiv

4+阅读 · 2019年4月23日

Transfer Learning with Neural AutoML

Arxiv

5+阅读 · 2018年9月11日

MLtuner: System Support for Automatic Machine Learning Tuning

Arxiv

3+阅读 · 2018年3月20日

Deep Active Learning for Named Entity Recognition

Arxiv

15+阅读 · 2018年2月4日

Mix-and-Match Tuning for Self-Supervised Semantic Segmentation

Arxiv

8+阅读 · 2018年1月30日

微信扫码咨询专知VIP会员