Descriptor: Extended-Length Audio Dataset for Synthetic Voice Detection and Speaker Recognition (ELAD-SVDSR)

This paper introduces the Extended-Length Audio Dataset for Synthetic Voice Detection and Speaker Recognition (ELAD-SVDSR), a resource specifically designed to facilitate the creation of high-quality deepfakes and to support the development of detection systems trained against them. The dataset comprises 45-minute audio recordings from 36 participants, each reading various newspaper articles under controlled conditions, captured via five microphones of differing quality. By focusing on extended-duration audio, ELAD-SVDSR captures a richer range of speech attributes, such as pitch contours, intonation patterns, and nuanced delivery, enabling models to generate more realistic and coherent synthetic voices. In turn, this approach allows for the creation of robust deepfakes that can serve as challenging examples in datasets used to train and evaluate synthetic voice detection methods. As part of this effort, 20 deepfake voices have already been created and added to the dataset to showcase its potential. Anonymized metadata on speaker demographics accompanies the dataset. ELAD-SVDSR is expected to spur significant advancements in audio forensics, biometric security, and voice authentication systems.

