About Voice: A Longitudinal Study of Speaker Recognition Dataset Dynamics

Like face recognition, speaker recognition is widely used for voice-based biometric identification in a broad range of industries, including banking, education, recruitment, immigration, law enforcement, healthcare, and well-being. However, while dataset evaluations and audits have improved data practices in computer vision and face recognition, the data practices in speaker recognition have gone largely unquestioned. Our research aims to address this gap by exploring how dataset usage has evolved over time and what implications this has for bias and fairness in speaker recognition systems. Previous studies have demonstrated the presence of historical, representation, and measurement biases in popular speaker recognition benchmarks. In this paper, we present a longitudinal study of speaker recognition datasets used for training and evaluation from 2012 to 2021. We survey close to 700 papers to investigate community adoption of datasets and changes in usage over a crucial period in which speaker recognition approaches transitioned to the widespread adoption of deep neural networks. Our study identifies the most commonly used datasets in the field, examines their usage patterns, and assesses the dataset attributes that affect bias, fairness, and other ethical concerns. Our findings suggest areas for further research on the ethics and fairness of speaker recognition technology.

Voiceprint Recognition

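Speaker recognition, also called voiceprint recognition (VPR), is a biometric technology that automatically determines a speaker's identity from the speaker-specific information contained in speech, using computers and modern information recognition techniques. The goal of speaker recognition research is to extract features from speech that characterize the speaker and to build effective models and systems for automatic, accurate speaker identification.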
