Advancing the study of Large-Scale Learning in Overlapped Speech Detection - 专知论文

会员服务 ·

0

Performer · MoDELS · Learning · 情景 · 词性标注 ·

2023 年 8 月 11 日

Advancing the study of Large-Scale Learning in Overlapped Speech Detection

翻译：推进大规模学习在重叠语音检测中的研究

Zhaohui Yin,Jingguang Tian,Xinhui Hu,Xinkang Xu

Overlapped Speech Detection (OSD) is an important part of speech applications involving analysis of multi-party conversations. However, Most of the existing OSD models are trained and evaluated on specific dataset, which limits the application scenarios of these models. In order to solve this problem, we conduct a study of large-scale learning (LSL) in OSD and propose a more general 16K single-channel OSD model. In our study, 522 hours of labeled audio in different languages and styles are collected and used as the large-scale dataset. Rigorous comparative experiments are designed and used to evaluate the effectiveness of LSL in OSD task and the performance of OSD models based on different deep neural networks. The results show that LSL can significantly improve the performance and robustness of OSD models, and the OSD model based on Conformer (CF-OSD) with LSL is currently the best 16K single-channel OSD model. Moreover, the CF-OSD with LSL establishes a state-of-the-art performance with a F1-score of 80.8% and 52.0% on the Alimeeting test set and DIHARD II evaluation set, respectively.

翻译：重叠语音检测（OSD）是多说话人对话分析中语音应用的重要组成部分。然而，现有大多数OSD模型均在特定数据集上训练和评估，这限制了这些模型的应用场景。为解决该问题，我们开展了大规模学习（LSL）在OSD中的研究，并提出更通用的16K单通道OSD模型。研究中收集了522小时不同语言与风格的标注音频作为大规模数据集，通过设计严格的对比实验评估LSL在OSD任务中的有效性，以及基于不同深度神经网络的OSD模型性能。结果表明，LSL能显著提升OSD模型的性能与鲁棒性，其中基于Conformer并采用LSL的OSD模型（CF-OSD）当前为最优16K单通道OSD模型。此外，采用LSL的CF-OSD在Alimeeting测试集和DIHARD II评估集上分别取得了80.8%和52.0%的F1分数，达到当前最优性能。

0

相关内容

Performer

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

74+阅读 · 2016年11月26日

Hamilton-Jacibi方程的弱KAM理论

国家自然科学基金

2+阅读 · 2017年12月31日

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

“Fishes-in-net” 酵母孢子微胶囊式近平滑假丝酵母SCRII酶有机相高效手性合成机制研究

国家自然科学基金

3+阅读 · 2016年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

汉英篇章衔接对齐资源构建与分析研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

47+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

关于 Finsler 流形上调和映射与 Laplacian 的若干问题研究

国家自然科学基金

1+阅读 · 2014年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

Design of Chain-of-Thought in Math Problem Solving

Arxiv

0+阅读 · 2023年9月30日

Benchmarking Cognitive Biases in Large Language Models as Evaluators

Arxiv

0+阅读 · 2023年9月29日

Vistrust: a Multidimensional Framework and Empirical Study of Trust in Data Visualizations

Arxiv

0+阅读 · 2023年9月29日

On the Generalization and Approximation Capacities of Neural Controlled Differential Equations

Arxiv

0+阅读 · 2023年9月28日

Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Arxiv

0+阅读 · 2023年9月28日

Respondent-Driven Sampling: An Overview in the Context of Human Trafficking

Arxiv

0+阅读 · 2023年9月28日

Differentially Private Secure Multiplication: Hiding Information in the Rubble of Noise

Arxiv

0+阅读 · 2023年9月28日

A Survey of Human-in-the-loop for Machine Learning

Arxiv

37+阅读 · 2021年8月2日

A Probabilistic Representation of DNNs: Bridging Mutual Information and Generalization

Arxiv

17+阅读 · 2021年6月18日

Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth

Arxiv

20+阅读 · 2021年5月10日

VIP会员

文章信息

相关主题

最新内容

CVPR 2026教程｜扩散模型原理：连续、离散与实时生成

CVPR 2026教程｜扩散模型原理：连续、离散与实时生成

专知会员服务

1+阅读 · 今天13:30

重磅综述｜大模型智能体环境工程：建模、合成、评估与协同演化

重磅综述｜大模型智能体环境工程：建模、合成、评估与协同演化

专知会员服务

1+阅读 · 今天13:28

面向特种部队的、以操作员为中心的人工智能决策支持系统框架

面向特种部队的、以操作员为中心的人工智能决策支持系统框架

专知会员服务

5+阅读 · 今天7:54

《多域战场上反制小型无人机系统》150页

《多域战场上反制小型无人机系统》150页

专知会员服务

14+阅读 · 今天7:47

《基于成果军事教育框架下的军官联合职业军事教育认证程序》2026最新170页

《基于成果军事教育框架下的军官联合职业军事教育认证程序》2026最新170页

专知会员服务

5+阅读 · 今天7:43

战场人工智能：增强陆地作战能力的发现与要求

战场人工智能：增强陆地作战能力的发现与要求

专知会员服务

3+阅读 · 今天7:37

人工智能赋能指挥所：以人工智能为中心的指挥控制的核心要素

人工智能赋能指挥所：以人工智能为中心的指挥控制的核心要素

专知会员服务

5+阅读 · 今天7:33

以人工智能为中心的指挥控制

以人工智能为中心的指挥控制

专知会员服务

3+阅读 · 今天7:14

《通过适应复杂环境与特殊作战行动动态来变革情报周期》

《通过适应复杂环境与特殊作战行动动态来变革情报周期》

专知会员服务

4+阅读 · 今天4:15

俄乌冲突背景下军事特种公路运输日益增长的重要性

俄乌冲突背景下军事特种公路运输日益增长的重要性

专知会员服务

4+阅读 · 今天3:44

速度优先于谨慎：NSPM-11意味着什么（将人工智能融入美国国防和情报行动最全面的声明）

速度优先于谨慎：NSPM-11意味着什么（将人工智能融入美国国防和情报行动最全面的声明）

专知会员服务

9+阅读 · 6月10日

《基于深度强化学习的反无人机技术研究》178页

《基于深度强化学习的反无人机技术研究》178页

专知会员服务

13+阅读 · 6月10日

技术突破与战略优势竞争：美军人工智能技术运用阶段分析

技术突破与战略优势竞争：美军人工智能技术运用阶段分析

专知会员服务

8+阅读 · 6月10日

“史诗怒火”行动与“AI中心战”模式的浮现

“史诗怒火”行动与“AI中心战”模式的浮现

专知会员服务

14+阅读 · 6月10日

【CVPR2026教程】扩散模型的解析理解

【CVPR2026教程】扩散模型的解析理解

专知会员服务

5+阅读 · 6月10日

相关VIP内容

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

重磅综述｜大模型智能体环境工程：建模、合成、评估与协同演化

《多域战场上反制小型无人机系统》150页

CVPR 2026教程｜扩散模型原理：连续、离散与实时生成

面向特种部队的、以操作员为中心的人工智能决策支持系统框架

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

74+阅读 · 2016年11月26日

相关论文

Design of Chain-of-Thought in Math Problem Solving

Arxiv

0+阅读 · 2023年9月30日

Benchmarking Cognitive Biases in Large Language Models as Evaluators

Arxiv

0+阅读 · 2023年9月29日

Vistrust: a Multidimensional Framework and Empirical Study of Trust in Data Visualizations

Arxiv

0+阅读 · 2023年9月29日

On the Generalization and Approximation Capacities of Neural Controlled Differential Equations

Arxiv

0+阅读 · 2023年9月28日

Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Arxiv

0+阅读 · 2023年9月28日

Respondent-Driven Sampling: An Overview in the Context of Human Trafficking

Arxiv

0+阅读 · 2023年9月28日

Differentially Private Secure Multiplication: Hiding Information in the Rubble of Noise

Arxiv

0+阅读 · 2023年9月28日

A Survey of Human-in-the-loop for Machine Learning

Arxiv

37+阅读 · 2021年8月2日

A Probabilistic Representation of DNNs: Bridging Mutual Information and Generalization

Arxiv

17+阅读 · 2021年6月18日

Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth

Arxiv

20+阅读 · 2021年5月10日

相关基金

Hamilton-Jacibi方程的弱KAM理论

国家自然科学基金

2+阅读 · 2017年12月31日

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

“Fishes-in-net” 酵母孢子微胶囊式近平滑假丝酵母SCRII酶有机相高效手性合成机制研究

国家自然科学基金

3+阅读 · 2016年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

汉英篇章衔接对齐资源构建与分析研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

47+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

关于 Finsler 流形上调和映射与 Laplacian 的若干问题研究

国家自然科学基金

1+阅读 · 2014年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员